Arto Salomaa 



Preface 



This Festschrift celebrates the 70th birthday of Arto Kustaa Salomaa (born 
in Turku, Finland on June 6, 1934), one of the most influential researchers in 
theoretical computer science. 

Most of his research concerns theory - he is one of the founding fathers of formal 
language and automata theory, but he has also made important contributions 
to cryptography and natural computing. His approach to research in theoretical 
computer science is exemplary and inspirational for his students, collaborators, 
and the readers of his papers and books. For him, the role of theory (in computer 
science) is to discover general rules of information processing that hold within 
computer science and in the world around us. One should not waste time on 
research concerning passing artifacts (or fashionable topics of the moment) in 
computer science - theory should be permanently predictive, insightful, and 
inspiring. That’s why we chose the title “Theory is Forever” . 

The main source of his influence on theoretical computer science is his publica- 
tions. Arto is a born writer - his papers and books are always most elegant. He 
has a unique gift for identifying the real essence of a research problem, and then 
presenting it in an incisive and eloquent way. He can write about a very involved 
formal topic and yet avoid a (much too common) overformalization. Many of 
his writings are genuine jewels and belong to the classics of theoretical computer 
science. They have inspired generations of students and researchers. Indeed, even 
computers as well as computer science have learned a lot from Arto’s publica- 
tions - this is nicely illustrated by DADARA on the cover of this volume. His 
writing talent extends beyond science - he writes beautiful and engaging sto- 
ries, and his close friends very much enjoy receiving his long, entertaining and 
informative letters. 

There is much other information that could be cited in this preface, such as the 
fact that he is one of the most celebrated computer scientists (e.g., he holds 
eight honorary degrees), or that he has been very instrumental in providing the 
organizational infrastructure for theoretical computer science in Europe (e.g., 
he is the past President of the European Association for Theoretical Computer 
Science), or that he is an absolute authority on the Finnish sauna (including 
both theory and practice). However, all of these accomplishments have been 
documented already in many places (e.g., in the companion book “Jewels are 
Forever”^ published on the occasion of Arto’s 65th birthday). Thus we have 
restricted ourselves to reflections on his research and writings. 

We are indebted to all the contributors for their tribute to Arto through this 
book. We ourselves have benefited enormously through many years of collaboration 
with Arto from his guidance and friendship - editing this volume is just 
a token of our gratitude. We are also indebted to Mrs. Ingeborg Mayer from 
Springer-Verlag for the pleasant and efficient collaboration in producing this 
volume. As a matter of fact this collaboration is quite symbolic, as Arto has 
worked very closely with Springer-Verlag, especially with Mrs. Ingeborg Mayer 
and Dr. Hans Wossner, on many projects over many years. Finally, our special 
thanks go to T. Harju, M. Hirvensalo, A. Lepisto, and Kalle Saari for their work 
on this book. 

^ J. Karhumaki, H. Maurer, G. Paun, G. Rozenberg, Jewels are Forever, Contributions 
on Theoretical Computer Science in Honor of Arto Salomaa, Springer-Verlag, 1999. 

April 2004 Juhani Karhumaki 

Hermann Maurer 
Gheorghe Paun 
Grzegorz Rozenberg 
Abstract. Ternary algebra has been used for detection of hazards in 
logic circuits since 1948. Process spaces were introduced in 1995 
as abstract models of concurrent processes. Surprisingly, process spaces 
turned out to be special ternary algebras. We study symmetry in process 
spaces; this symmetry is analogous to duality, but holds among three 
algebras. An important role is played here by the uncertainty partial 
order, which has been used since 1972 in algebras dealing with ambiguity. 
We prove that each process space consists of three isomorphic Boolean 
algebras and elements related to partitions of a set into three blocks. 



1 Introduction 

The concept of duality is well known in mathematics. In this paper we study a 
similar concept, but one that applies to three objects instead of two. The road 
that led to the discovery of these properties deserves to be briefly mentioned, 
because several diverse topics come together in this work. 

The usual tool for the analysis and design of digital circuits is Boolean al- 
gebra, based on two values. As early as 1948, however, it was recognized that 
three values are useful for describing certain phenomena in logic circuits [10]. We 
provide more information about the use of ternary algebra for hazard detection 
in Section 2. 

Ternary algebra is closely related to ternary logic [11]. This type of logic, 
allowing a third, ambiguous value in addition to true and false, was studied 
by Mukaidono in 1972 [12], who introduced the uncertainty partial order, in 
addition to the usual lattice partial order. This partial order turned out to be 
very useful; see, for example, [3, 6]. It also plays an important role in the ternary 
symmetry we are about to describe. 

In 1995 Negulescu [13] introduced process spaces as abstract models of 
concurrent processes. Surprisingly, process spaces turned out to be special 
types of ternary algebras. It is in process spaces that “ternary duality” 
exists. Similar properties also hold in so-called linear logic, which has been 
used as another framework for representing concurrent processes, and has 
connections to Petri nets [17]. This topic is outside the scope of the present 
paper. 

The remainder of the paper is structured as follows. Section 2 illustrates 
hazard detection using ternary algebra. We also recall some basic concepts from 
lattice theory and summarize the properties of ternary algebras. Process spaces 
are defined in Section 3. Ternary symmetry is next discussed in Section 4. In 
Section 5 we show that each process space contains three isomorphic Boolean 
algebras. Section 6 characterizes elements of a process space that are outside the 
Boolean algebras, and Section 7 summarizes our results. 

We assume that unary operations have precedence over binary operations. 
For example, $-x \sqcap -y$ denotes $(-x) \sqcap (-y)$. Sequences of unary 
operations are written without parentheses; for example, $-/-x$ denotes 
$-(/(-x))$. Set inclusion is denoted by $\subseteq$ and proper inclusion, by 
$\subset$. Proofs that are straightforward and involve only elementary set 
theory are omitted. 

2 Ternary Algebras 

The logic values are 0 and 1, and a third value, denoted here by $\Phi$, is 
used to represent an intermediate or uncertain signal. This idea was used by 
many authors, but we mention here only Eichelberger’s 1965 ternary simulation 
algorithm [8] and its later characterizations [6]. More information about 
hazard detection can be found in a recent survey [4]. The following example 
illustrates the use of ternary simulation to detect hazards in logic circuits. 

Example 1. Consider the behavior of the circuit of Fig. 1(a) when its input x 
changes from 0 to 1. Initially, x = 0, y = 1, and z = 0. After the transition, 
x = 1, y = 0, and z = 0. Thus, z is not supposed to change during this transition. 
If the inverter has a sufficiently large delay, however, for a short time both inputs 
to the AND gate may be 1, and there may be a short 1-pulse in z. Such a pulse 
is undesirable, because it may cause an error in the computation. 

In the first part of the ternary simulation, Algorithm A, we change the input 
to $\Phi$, which indicates that the input is first going through an intermediate, 
uncertain value. See Fig. 1(b); the first two entries on each line illustrate 
Algorithm A. The circuit is then analyzed in ternary algebra to determine which 
gates will undergo changes; the outputs of the changing gates become $\Phi$. In 
our example, the inverter output becomes uncertain because its input is uncertain. 
Also, since one input of the AND gate is 1 and the other uncertain, z becomes 
$\Phi$. 

In the second part, Algorithm B, the input is changed to its final binary 
value, and the circuit is again simulated in ternary algebra. Some gate outputs 
that became $\Phi$ in Algorithm A will become binary, while others remain $\Phi$. 
In our example, both y and z become 0; see the last two entries in Fig. 1(b). If a 
gate output has the same (binary) value in the initial state and also at the end of 
Algorithm B, then that output is not supposed to change during the transition 
in question. If, however, that output is $\Phi$ after Algorithm A is applied, then 
we have detected a hazard, meaning that an undesired pulse may occur. This 
happens to the output z. □ 
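
The two passes of Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the circuit is taken to be an inverter y = NOT x feeding an AND gate z = x AND y, as the example describes, and the names `PHI`, `t_not`, `t_and`, `lub`, `algorithm_a` and `algorithm_b` are hypothetical helpers introduced here, not notation from the paper.

```python
PHI = "PHI"  # the third, uncertain value (assumed encoding)

def t_not(a):
    # Ternary inverter: the uncertain value stays uncertain.
    return PHI if a == PHI else 1 - a

def t_and(a, b):
    # Ternary AND: a 0 on either input forces 0; two 1s give 1; otherwise uncertain.
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return PHI

def lub(a, b):
    # Least upper bound in the uncertainty order: equal values stay, others become PHI.
    return a if a == b else PHI

def excitations(x, state):
    # Gate outputs computed from the input x and the current state (y, z).
    y, _z = state
    return (t_not(x), t_and(x, y))

def algorithm_a(x_old, x_new, state):
    # First pass: the input takes the value lub(x_old, x_new); uncertainty only grows.
    x = lub(x_old, x_new)
    while True:
        new = tuple(lub(s, e) for s, e in zip(state, excitations(x, state)))
        if new == state:
            return state
        state = new

def algorithm_b(x_new, state):
    # Second pass: the input takes its final binary value; iterate until stable.
    while True:
        new = excitations(x_new, state)
        if new == state:
            return state
        state = new

initial = (1, 0)                        # y = 1, z = 0 when x = 0
after_a = algorithm_a(0, 1, initial)    # state after Algorithm A
after_b = algorithm_b(1, after_a)       # state after Algorithm B

# z is 0 initially and 0 at the end, but uncertain in between: a hazard.
hazard_on_z = initial[1] == after_b[1] and after_a[1] == PHI
```

In this sketch the state is the pair (y, z); `hazard_on_z` reproduces the criterion stated at the end of the example.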
Fig. 1. Circuit with hazard: (a) binary analysis (b) ternary analysis 



We now recall some concepts from algebra. For more information about lattices 
see [1,7]. We use the following terminology. A semilattice [2] is an algebra 
$(S, \sqcup)$, where $S$ is a set and $\sqcup$ is an idempotent, commutative and 
associative binary operation on $S$. We define the partial order 
$\sqsubseteq_\sqcup$ on $S$ by 

    $x \sqsubseteq_\sqcup y \Leftrightarrow x \sqcup y = y$. 

A bisemilattice is an algebra $(S, \sqcup, \sqcap)$ in which $(S, \sqcup)$ and 
$(S, \sqcap)$ are semilattices, i.e., laws M1-M3, M1'-M3' of Table 1 hold. A 
bisemilattice has two partial orders $\sqsubseteq_\sqcup$ and 
$\sqsubseteq_\sqcap$, the latter defined by 

    $x \sqsubseteq_\sqcap y \Leftrightarrow x \sqcap y = x$. 

Table 1. Laws of de Morgan Algebras 

M1   $x \sqcup x = x$                                          M1'  $x \sqcap x = x$ 
M2   $x \sqcup y = y \sqcup x$                                 M2'  $x \sqcap y = y \sqcap x$ 
M3   $x \sqcup (y \sqcup z) = (x \sqcup y) \sqcup z$           M3'  $x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z$ 
M4   $x \sqcup (x \sqcap y) = x$                               M4'  $x \sqcap (x \sqcup y) = x$ 
M5   $x \sqcup \bot = x$                                       M5'  $x \sqcap \top = x$ 
M6   $x \sqcup \top = \top$                                    M6'  $x \sqcap \bot = \bot$ 
M7   $x \sqcup (y \sqcap z) = (x \sqcup y) \sqcap (x \sqcup z)$   M7'  $x \sqcap (y \sqcup z) = (x \sqcap y) \sqcup (x \sqcap z)$ 
M8   $--x = x$ 
M9   $-(x \sqcup y) = -x \sqcap -y$                            M9'  $-(x \sqcap y) = -x \sqcup -y$ 

If a bisemilattice satisfies the absorption laws M4 and M4', then it is a lattice. 
The two partial orders $\sqsubseteq_\sqcup$ and $\sqsubseteq_\sqcap$ then coincide, 
and are denoted by $\sqsubseteq$. The converse of $\sqsubseteq$ is denoted by 
$\sqsupseteq$. The operations $\sqcup$ and $\sqcap$ are the join and meet of 
the lattice, respectively. A lattice is bounded if it has greatest and least elements 
$\top$ (top) and $\bot$ (bottom) satisfying M5, M6, M5', M6'. A bounded lattice is 
represented by $(S, \sqcup, \sqcap, \bot, \top)$. A lattice satisfying the 
distributive laws M7 and M7' is distributive. 

A de Morgan algebra is an algebra $(S, \sqcup, \sqcap, -, \bot, \top)$, where 
$(S, \sqcup, \sqcap, \bot, \top)$ is a bounded distributive lattice, and $-$ is a 
unary operation, called quasi-complement, that satisfies M8 and de Morgan’s laws 
M9 and M9'. 

A Boolean algebra is a de Morgan algebra $(S, \sqcup, \sqcap, -, \bot, \top)$, 
which also satisfies the complement laws: 

    $x \sqcup -x = \top$,        $x \sqcap -x = \bot$. 

A ternary algebra $(S, \sqcup, \sqcap, -, \bot, \Phi, \top)$ is a de Morgan algebra 
$(S, \sqcup, \sqcap, -, \bot, \top)$ with an additional constant $\Phi$ satisfying 

T1   $-\Phi = \Phi$ 
T2   $(x \sqcup -x) \sqcup \Phi = x \sqcup -x$        T2'  $(x \sqcap -x) \sqcap \Phi = x \sqcap -x$ 

For more information about ternary algebras the reader is referred to [5, 6, 
9, 12]. Here, we mention only the uncertainty partial order and the subset-pair 
representation of ternary algebras. 

Figure 2(a) shows the lattice order $\sqsubseteq$ of the 3-element ternary algebra 
$T_3 = (\{\bot, \Phi, \top\}, \sqcup, \sqcap, -, \bot, \Phi, \top)$, and Fig. 2(b), 
its uncertainty partial order $\preceq$ [6, 12], where $\Phi$ represents the unknown 
or uncertain value, and $\bot$ and $\top$ are the known or certain values. 



Fig. 2. Partial orders in $T_3$: (a) $\sqsubseteq$ (b) $\preceq$ 



For any $x, y \in T_3$, the least upper bound of $\{x, y\}$ in the partial order 
$\preceq$ can be expressed as $(x \sqcap y) \sqcup ((x \sqcup y) \sqcap \Phi)$ [6]. 
We extend this to any ternary algebra $(S, \sqcup, \sqcap, -, \bot, \Phi, \top)$ by 
defining the binary operation $\vee$ [3] as 

    $x \vee y = (x \sqcap y) \sqcup ((x \sqcup y) \sqcap \Phi)$. 

It is easily verified that $(S, \vee)$ is a semilattice. The semilattice partial 
order is 

    $x \preceq y \Leftrightarrow x \vee y = y$. 
Let $\mathcal{E}$ be a nonempty set, and $\mathcal{P}$ a collection of ordered 
pairs $(X, X')$ of subsets of $\mathcal{E}$ such that $X \cup X' = \mathcal{E}$. 
For $(X, X'), (Y, Y') \in \mathcal{P}$, let 

    $(X, X') \sqcup (Y, Y') = (X \cap Y, X' \cup Y')$,    (1) 

    $(X, X') \sqcap (Y, Y') = (X \cup Y, X' \cap Y')$,    (2) 

    $-(X, X') = (X', X)$.    (3) 

Let $\bot = (\mathcal{E}, \emptyset)$, $\Phi = (\mathcal{E}, \mathcal{E})$, and 
$\top = (\emptyset, \mathcal{E})$. Then 
$(\mathcal{P}, \sqcup, \sqcap, -, \bot, \Phi, \top)$ is a subset-pair algebra [5] 
if $\mathcal{P}$ is closed under $\sqcup$, $\sqcap$, and $-$, and contains the 
constants $\bot$, $\Phi$, and $\top$. The following result was shown in [5,9]: 

Theorem 1. Every subset-pair algebra is a ternary algebra, and every ternary 
algebra is isomorphic to a subset-pair algebra. 

It is easy to verify that 



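    $(X, X') \sqsubseteq (Y, Y') \Leftrightarrow X \supseteq Y$ and $X' \subseteq Y'$,    (4) 

    $(X, X') \vee (Y, Y') = (X \cup Y, X' \cup Y')$,    (5) 

    $(X, X') \preceq (Y, Y') \Leftrightarrow X \subseteq Y$ and $X' \subseteq Y'$.    (6) 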
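
The identities (4)-(6) and the laws T1, T2, T2' can be checked by brute force on a small example. The following Python sketch does this for the set {1, 2}, taking all pairs (X, X') whose union is the whole set; the function names (`join`, `meet`, `neg`, `vee`) and the choice of underlying set are assumptions made for the illustration.

```python
from itertools import combinations

E = frozenset({1, 2})  # small underlying set, chosen only for illustration

def subsets(s):
    # All subsets of s as frozensets.
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# All ordered pairs (X, X') of subsets of E with X ∪ X' = E.
P = [(X, Xp) for X in subsets(E) for Xp in subsets(E) if X | Xp == E]

def join(x, y):   # equation (1)
    return (x[0] & y[0], x[1] | y[1])

def meet(x, y):   # equation (2)
    return (x[0] | y[0], x[1] & y[1])

def neg(x):       # equation (3)
    return (x[1], x[0])

BOT, PHI, TOP = (E, frozenset()), (E, E), (frozenset(), E)

def vee(x, y):
    # Least upper bound in the uncertainty order, defined from join, meet and PHI.
    return join(meet(x, y), meet(join(x, y), PHI))

assert neg(PHI) == PHI                                         # T1
for x in P:
    assert join(join(x, neg(x)), PHI) == join(x, neg(x))       # T2
    assert meet(meet(x, neg(x)), PHI) == meet(x, neg(x))       # T2'
    for y in P:
        # (4): x below y in the lattice order iff X ⊇ Y and X' ⊆ Y'
        assert (join(x, y) == y) == (x[0] >= y[0] and x[1] <= y[1])
        # (5): closed form of the uncertainty join
        assert vee(x, y) == (x[0] | y[0], x[1] | y[1])
        # (6): x below y in the uncertainty order iff X ⊆ Y and X' ⊆ Y'
        assert (vee(x, y) == y) == (x[0] <= y[0] and x[1] <= y[1])
```

A finite check of this kind does not replace the proofs, but it is a quick way to catch a sign or inclusion error when transcribing the definitions.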



3 Process Spaces 

The material in this section is based on [14, 15]. The discussion of applications 
of process spaces is beyond the scope of this paper, and we treat process spaces 
only as mathematical objects. However, we do give a simple example to motivate 
the reader. 

Let $\mathcal{E}$ be any nonempty set; a process $x$ over $\mathcal{E}$ is an 
ordered pair $x = (X, X')$ of subsets of $\mathcal{E}$ such that 
$X \cup X' = \mathcal{E}$. 

We refer to $\mathcal{E}$ as a set of executions. Several different examples of 
execution sets have been used [14,15]. For the purposes of this paper, however, we 
may think of $\mathcal{E}$ as the set of all sequences of actions from some action 
universe $U$; thus $\mathcal{E} = U^*$. A process $x = (X, X')$ represents a 
contract between a device and its environment: the device guarantees that only 
executions from $X$ occur, and the environment guarantees that only executions from 
$X'$ occur. Thus, $\overline{X} = \mathcal{E} \setminus X$ is the set of executions 
in which the device violates the contract. Similarly, for executions in 
$\overline{X'}$, the environment violates the contract. The condition 
$X \cup X' = \mathcal{E}$, or equivalently 
$\overline{X} \cap \overline{X'} = \emptyset$, means that the blame for violating 
the contract can be assigned to either the device or the environment, but not both. 
The set $X$ is called the set of accessible executions of $x$, and $X'$ is the set 
of acceptable executions. 
Example 2. Figure 3(a) shows a symbol for a buffer, and Fig. 3(b) shows a 
sequential machine describing its behavior. The buffer starts in the state marked 
by an incoming arrow. If it receives a signal on its input a, it moves to a new state. 
It is expected to respond by producing a signal on its output b and returning 
to the original state. Thus, the normal operation of the buffer consists of an 
alternating sequence of a’s and b’s starting with a. The two states involved in 
this normal operation are marked g, representing the fact that they are the goal 
states of the process. 




Fig. 3. Buffer process: (a) block diagram (b) behavior 



It is possible that the environment of the buffer does not behave according to 
the specified goal, and produces two consecutive a’s in the initial state. From the 
point of view of the buffer, this environment behavior can be rejected as illegal; 
hence the state diagram moves to a reject state marked r, and remains in that 
state thereafter. It is also possible that the buffer malfunctions by producing b in 
the initial state. This is a violation of the contract by the buffer, and the process 
moves to the state labelled e; such executions have been called the escapes of 
the process. 

Let $L_g$ be the set of all words taking the machine of Fig. 3(b) to a state 
marked g, and let $L_e$ and $L_r$ be defined similarly. One verifies that 
$L_g = (ab)^*(\epsilon \cup a)$, where $\epsilon$ is the empty word, 
$L_e = (ab)^* b (a \cup b)^*$, and $L_r = (ab)^* aa (a \cup b)^*$. 
The buffer process is $(X, X') = (L_g \cup L_e, L_g \cup L_r)$. □ 

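The process space over $\mathcal{E}$ is denoted by $\mathcal{P}_{\mathcal{E}}$, 
and it is the set of all processes over $\mathcal{E}$. Note that each set 
$\mathcal{E}$ defines a unique process space. 

In constructing $\mathcal{P}_{\mathcal{E}}$ we must put each element of 
$\mathcal{E}$ in $X$ or $X'$ or both. Hence, if $\mathcal{E}$ has cardinality $n$, 
then $\mathcal{P}_{\mathcal{E}}$ has cardinality $3^n$. The smallest process space 
has three elements. If $\mathcal{E} = \{1\}$, say, then the three processes are: 
$(\{1\}, \emptyset)$, $(\{1\}, \{1\})$, and $(\emptyset, \{1\})$. 

In every process space we identify three special elements: bottom, 
$\bot = (\mathcal{E}, \emptyset)$, void, $\Phi = (\mathcal{E}, \mathcal{E})$, and 
top, $\top = (\emptyset, \mathcal{E})$. 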
Next, we define the main operations on processes. These operations are 
motivated by applications to concurrent systems, and are related to operations in 
several theories of concurrency. For more details see [14, 15]. 

- Reflection, defined as in (3). Reflection permutes the roles of the device and 
the environment. If a process $x = (X, X')$ is the contract seen by the device, 
then $-x = (X', X)$ represents the same contract as seen by the environment. 

- Refinement, defined as in (4). If $x = (X, X')$, $y = (Y, Y')$, and 
$x \sqsubseteq y$, then $y$ is an acceptable substitute for $x$, because $y$ 
accesses fewer executions than $x$, i.e., its device obeys tighter constraints 
($Y \subseteq X$), and accepts more executions than $x$, i.e., its environment has 
weaker constraints ($Y' \supseteq X'$). 

- Product, written $\times$, is a binary operation such that 

    $(X, X') \times (Y, Y') = (X \cap Y, (X' \cap Y') \cup \overline{X \cap Y})$.    (7) 

Product models a system formed by two devices operating jointly. The system’s 
accessible executions are those that are accessible to both components. 
Its set of acceptable executions consists of the executions that are acceptable 
to both components and those that must be avoided by one of the 
components. 

Refinement is a partial order which induces a lattice over a process space, 
whose join is given by (1) and meet, by (2). Furthermore, this lattice has $\bot$ 
and $\top$ as bounds. This, together with reflection, which is defined as in (3), 
makes an arbitrary process space 
$(\mathcal{P}_{\mathcal{E}}, \sqcup, \sqcap, -, \bot, \Phi, \top)$ a subset-pair 
algebra, and, by Theorem 1, a ternary algebra. Reflection is an involution and it 
reverses refinement. Thus 

    $--x = x$,    (8) 

    $x \sqsubseteq y \Leftrightarrow -x \sqsupseteq -y$.    (9) 

One verifies that $(\mathcal{P}_{\mathcal{E}}, \times, \Phi, \top)$ is a semilattice 
with identity $\Phi$ and greatest element $\top$. 

An example of a process space is shown in Fig. 4. Here $\mathcal{E} = \{1, 2\}$, and 
$\mathcal{P}_{\mathcal{E}} = \mathcal{P}_9$ has nine elements. Its partial orders 
$\sqsubseteq$ and $\preceq$ are shown in the figure. 
To simplify the notation, we denote $\{1, 2\}$ simply by 12, etc. 



4 Ternary Symmetry 

Process spaces admit a ternary symmetry [14, 16] based on a unary operation, 
called rotation (/), and defined by: 

    $/x = (\overline{X \cap X'}, X)$,    (10) 

for all $x = (X, X')$. 
Proposition 1. For any processes $x$ and $y$ we have: 

    $///x = x$,    (11) 

    $x \times y = //(/x \sqcap /y)$,    (12) 

    $-/x = //-x$,    $/-x = -//x$,    (13) 

    $/\bot = \Phi$,    $/\Phi = \top$,    $/\top = \bot$.    (14) 

Proposition 1 shows that / is bijective, since it is a root of identity, and 
therefore admits an inverse, namely //. Furthermore, Prop. 1 reveals that map 
/ is an isomorphism of the semilattices $(\mathcal{P}_{\mathcal{E}}, \times)$ and 
$(\mathcal{P}_{\mathcal{E}}, \sqcap)$. 

The ternary symmetry brought out by Prop. 1 justifies an alternate representation 
of processes in a process space as (set) triplets [13, 14], defined below. 
For process $x = (X, X')$, we also write $x = (X_1, X_2, X_3)$ where 
$X_1 = X \setminus X'$, $X_2 = X \cap X'$, and $X_3 = X' \setminus X$. 

Note that the entries in a set triplet “split” $\mathcal{E}$ in the following 
sense. If $x = (X_1, X_2, X_3)$, then 
$X_1 \cap X_2 = X_2 \cap X_3 = X_3 \cap X_1 = \emptyset$ and 
$X_1 \cup X_2 \cup X_3 = \mathcal{E}$. 
(This “split” is not a “partition” because some of the blocks might be empty.) 

We redefine below the main operations on processes in the set triplet 
representation. These definitions are equivalent to the definition on set pairs 
given previously. 

    $(X_1, X_2, X_3) \times (Y_1, Y_2, Y_3) = ((X_1 \setminus Y_3) \cup (Y_1 \setminus X_3), X_2 \cap Y_2, X_3 \cup Y_3)$.    (15) 

    $(X_1, X_2, X_3) \sqsubseteq (Y_1, Y_2, Y_3) \Leftrightarrow Y_1 \subseteq X_1 \wedge X_3 \subseteq Y_3$.    (16) 

Note that there is no condition on $X_2$ and $Y_2$. 

    $-(X_1, X_2, X_3) = (X_3, X_2, X_1)$.    (17) 

    $\bot = (\mathcal{E}, \emptyset, \emptyset)$,  $\Phi = (\emptyset, \mathcal{E}, \emptyset)$,  $\top = (\emptyset, \emptyset, \mathcal{E})$.    (18) 

Rotation has a simpler form in the set triplet representation: 

    $/(X_1, X_2, X_3) = (X_3, X_1, X_2)$.    (19) 
One also verifies that the operation // of double rotation, being the composition 
of two single rotations, satisfies 

    $//x = (X', \overline{X \cap X'})$    (20) 

and, equivalently, 

    $//(X_1, X_2, X_3) = (X_2, X_3, X_1)$.    (21) 
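
The agreement between the pair and triplet forms, and the identities of Prop. 1, can be tested exhaustively on a three-element execution set. The Python sketch below is an illustration only: the set {1, 2, 3} and all function names are assumptions, and the pair-form rotation is coded as the complement of the intersection of X and X', which is what the triplet form (19) gives when translated back to pairs.

```python
from itertools import product as cartesian

E = (1, 2, 3)                 # small execution set, chosen for illustration
full, empty = frozenset(E), frozenset()

def all_triplets():
    # Each execution is placed in block 1, 2 or 3, giving 3**len(E) triplets.
    out = []
    for labels in cartesian((0, 1, 2), repeat=len(E)):
        out.append(tuple(frozenset(e for e, l in zip(E, labels) if l == i)
                         for i in range(3)))
    return out

def to_pair(t):               # (X1, X2, X3) -> (X, X')
    return (t[0] | t[1], t[1] | t[2])

def to_triplet(p):            # (X, X') -> (X \ X', X ∩ X', X' \ X)
    X, Xp = p
    return (X - Xp, X & Xp, Xp - X)

def rot(t):                   # rotation in triplet form
    return (t[2], t[0], t[1])

def neg(t):                   # reflection in triplet form
    return (t[2], t[1], t[0])

def rot_pair(p):              # rotation in pair form
    X, Xp = p
    return (full - (X & Xp), X)

def meet(s, t):               # meet computed on pairs, then converted back
    (X, Xp), (Y, Yp) = to_pair(s), to_pair(t)
    return to_triplet((X | Y, Xp & Yp))

def prod(s, t):               # product in triplet form
    return ((s[0] - t[2]) | (t[0] - s[2]), s[1] & t[1], s[2] | t[2])

def prod_pair(p, q):          # product in pair form
    (X, Xp), (Y, Yp) = p, q
    return (X & Y, (Xp & Yp) | (full - (X & Y)))

T = all_triplets()
BOT, PHI, TOP = (full, empty, empty), (empty, full, empty), (empty, empty, full)

for x in T:
    assert to_triplet(to_pair(x)) == x
    assert rot(rot(rot(x))) == x                          # triple rotation is identity
    assert to_pair(rot(x)) == rot_pair(to_pair(x))        # pair and triplet forms agree
    assert neg(rot(x)) == rot(rot(neg(x)))                # -/x = //-x
    assert rot(neg(x)) == neg(rot(rot(x)))                # /-x = -//x
    for y in T:
        assert prod(x, y) == rot(rot(meet(rot(x), rot(y))))          # x × y = //(/x ⊓ /y)
        assert to_pair(prod(x, y)) == prod_pair(to_pair(x), to_pair(y))
assert (rot(BOT), rot(PHI), rot(TOP)) == (PHI, TOP, BOT)
```

Working with triplets makes rotation a plain cyclic shift, which is why most of the checks are one-liners.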

Several new operations are defined using ternary symmetry. Each definition 
relates two operations in a way similar to that between $\times$ and $\sqcap$ in 
Prop. 1. For completeness, we repeat (12). 

Definition 1. For arbitrary processes $x$ and $y$ from process space 
$\mathcal{P}_{\mathcal{E}}$ 

    $x \times y = //(/x \sqcap /y)$, 
    $x \otimes y = //(/x \times /y)$, 
    $x + y = //(/x \sqcup /y)$, 
    $x \oplus y = //(/x + /y)$. 

The refinement partial order $\sqsubseteq$ is related to the uncertainty partial 
order, as shown below. This is remarkable because the notions of refinement and 
uncertainty as formally defined here were motivated by totally different 
applications. 

Proposition 2. For arbitrary processes $x$ and $y$ from process space 
$\mathcal{P}_{\mathcal{E}}$ 

X d: y ^ lx Q ly. 



5 Boolean Trios 

Let $\mathcal{E}$ be any nonempty set and let 
$\mathcal{P} = \mathcal{P}_{\mathcal{E}}$ be the process space on $\mathcal{E}$. Let 
$(\mathcal{P}, \sqcup, \sqcap, -, \bot, \Phi, \top)$ be the process space viewed as 
a ternary algebra. Let $Q_T$ be the set of all elements comparable to $\Phi$ in 
$\mathcal{P}$. 

Theorem 2. The structure $(Q_T, \sqcup, \sqcap, -, \bot, \Phi, \top)$ is a 
sub-ternary-algebra of $(\mathcal{P}, \sqcup, \sqcap, -, \bot, \Phi, \top)$. 

Proof. It is easy to verify that $Q_T$ is closed under the two binary operations, 
and contains the three constants. Since de Morgan’s laws hold, we have 
$x \sqsubseteq y \Leftrightarrow -x \sqsupseteq -y$. In particular, 
$x \sqsubseteq \Phi \Leftrightarrow -x \sqsupseteq \Phi$, in view of T1. Hence 
$Q_T$ is also closed under $-$. □ 

Let $Q_M$ be the set of all the processes of $\mathcal{P}$ that are minimal in the 
uncertainty partial order $\preceq$. Since $(X, X') \preceq (Y, Y')$ if and only if 
$X \subseteq Y$ and $X' \subseteq Y'$, a process $x = (X, X')$ is minimal if and 
only if $X \cap X' = \emptyset$. Otherwise, if there is a common element in $X$ and 
$X'$, we can remove it from $X$ (or $X'$), and obtain a smaller process. Thus, if 
$x \in Q_M$, then $x$ has the form $x = (X, \overline{X})$, for some 
$X \subseteq \mathcal{E}$. Note that $Q_M$ includes $\bot$ and $\top$. 

Let $(\mathcal{P}, \sqcup, \sqcap, -, \bot, \top)$ be the process space 
$(\mathcal{P}, \sqcup, \sqcap, -, \bot, \Phi, \top)$ viewed as a de Morgan algebra. 
Theorem 3. The structure $(Q_M, \sqcup, \sqcap, -, \bot, \top)$ is a 
sub-de-Morgan-algebra of $(\mathcal{P}, \sqcup, \sqcap, -, \bot, \top)$. 
Furthermore, $(Q_M, \sqcup, \sqcap, -, \bot, \top)$ is a Boolean algebra. 

Proof. Let $x, y$ be in $Q_M$. Then $x = (X, \overline{X})$ and 
$y = (Y, \overline{Y})$ for some $X$ and $Y$. Then 
$x \sqcap y = (X \cup Y, \overline{X} \cap \overline{Y}) = (X \cup Y, \overline{X \cup Y})$, 
$x \sqcup y = (X \cap Y, \overline{X} \cup \overline{Y}) = (X \cap Y, \overline{X \cap Y})$, 
i.e., $Q_M$ is closed under both binary operations. Furthermore, 
$-x = (\overline{X}, X)$, and $Q_M$ is also closed under $-$. We have already noted 
that $Q_M$ contains $\bot$ and $\top$. 

Laws M1-M9, and M1'-M7', M9' hold because they hold in $\mathcal{P}$. Hence we 
need only to verify the complement laws. We have 
$x \sqcap -x = (X, \overline{X}) \sqcap (\overline{X}, X) = (X \cup \overline{X}, \overline{X} \cap X) = (\mathcal{E}, \emptyset) = \bot$. 
Similarly, $x \sqcup -x = \top$. □ 

We now consider the set of all the processes in $Q_T$ that are below $\Phi$. Let 
$(\mathcal{P}, \sqcup, \sqcap, -, \bot, \Phi, \top)$ be a process space, and let 
$Q_L = \{x \in \mathcal{P} \mid x \sqsubseteq \Phi\}$. If $x \in Q_L$, then $x$ has 
the form $x = (\mathcal{E}, X')$, for some $X' \subseteq \mathcal{E}$. Note that 
$Q_L$ contains $\bot$ and $\Phi$, and that 

    $/x = (\overline{X'}, \mathcal{E})$.    (22) 

Secondly, consider the set of all elements above $\Phi$. Let 
$Q_U = \{x \in \mathcal{P} \mid x \sqsupseteq \Phi\}$. If $x \in Q_U$, then 
$x = (X, \mathcal{E})$ for some $X \subseteq \mathcal{E}$. Note that $Q_U$ contains 
$\Phi$ and $\top$, and that 

    $/x = (\overline{X}, X)$,    (23) 

for all $x = (X, \mathcal{E})$ in $Q_U$. 

Finally, recall that elements of $Q_M$ have the form $x = (X, \overline{X})$ and 
note that 

    $/x = (\mathcal{E}, X)$.    (24) 

If $Q \subseteq \mathcal{P}$, we define $/Q = \{/q \mid q \in Q\}$ and 
$-Q = \{-q \mid q \in Q\}$. 

Proposition 3. $/Q_L = Q_U$, $/Q_U = Q_M$, and $/Q_M = Q_L$. 

Proof. In view of (22)-(24), we have $/Q_L \subseteq Q_U$, $/Q_U \subseteq Q_M$, 
and $/Q_M \subseteq Q_L$. Since $///x = x$, it follows that 
$Q_U = ///Q_U \subseteq //Q_M \subseteq /Q_L$. Thus $/Q_L = Q_U$. 
The other two equalities follow similarly. □ 
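
Proposition 3 can also be confirmed by enumeration. The Python sketch below builds the three sets for a three-element execution set and applies rotation in its pair form; the set {1, 2, 3} and the helper names are assumptions made for the illustration.

```python
from itertools import combinations

E = frozenset({1, 2, 3})   # small execution set, chosen for illustration

def subsets(s):
    # All subsets of s as frozensets.
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# The process space: all pairs (X, X') with X ∪ X' = E.
P = [(X, Xp) for X in subsets(E) for Xp in subsets(E) if X | Xp == E]
PHI = (E, E)

def refines(x, y):
    # Refinement order on pairs: X ⊇ Y and X' ⊆ Y'.
    return x[0] >= y[0] and x[1] <= y[1]

def rot(x):
    # Rotation in pair form: (complement of X ∩ X', X).
    X, Xp = x
    return (E - (X & Xp), X)

Q_L = {x for x in P if refines(x, PHI)}       # elements below PHI
Q_U = {x for x in P if refines(PHI, x)}       # elements above PHI
Q_M = {x for x in P if not (x[0] & x[1])}     # minimal in the uncertainty order

def image(Q):
    # The set /Q = {/q | q in Q}.
    return {rot(q) for q in Q}

assert image(Q_L) == Q_U
assert image(Q_U) == Q_M
assert image(Q_M) == Q_L
assert all(len(Q) == 2 ** len(E) for Q in (Q_L, Q_U, Q_M))
```

Each of the three sets has 2^n elements, and any two of them share exactly one of the constants, so together they cover 3 x (2^n - 1) processes.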

Proposition 4. For $x, y \in Q_L$, 

(a) $x \sqcup -/x = \Phi$, $x \sqcap -/x = \bot$, 

(b) $/(x \sqcup y) = /x \sqcup /y$, $/(x \sqcap y) = /x \sqcap /y$, 

(c) $x \sqcup y = x \otimes y = x + y$, $x \sqcap y = x \times y = x \oplus y$. 

Theorem 4. $(Q_L, \sqcup, \sqcap, -/, \bot, \Phi)$ is a Boolean algebra with join 
$\sqcup$, meet $\sqcap$, complement $-/$, least element $\bot$ and greatest 
element $\Phi$. 

Proof. We verify that $Q_L$ is closed under $\sqcup$, $\sqcap$ and $-/$, and 
contains $\bot$ and $\Phi$. Next we check that the laws of Boolean algebra hold for 
$Q_L$. Laws M1-M7, M1'-M7' hold in $Q_L$, since they hold in $\mathcal{P}$. The 
complement laws hold by Prop. 4 (a). The involution and de Morgan’s laws follow 
easily using the fact that each element in $Q_L$ is of the form 
$x = (\mathcal{E}, X')$. □ 
Proposition 5. For $x, y \in Q_U$, 

(a) $x \sqcup /-x = \top$, $x \sqcap /-x = \Phi$, 

(b) $/(x \sqcup y) = /x \sqcap /y$, $/(x \sqcap y) = /x \sqcup /y$, 

(c) $x \sqcup y = x \times y = x \oplus y$, $x \sqcap y = x + y = x \otimes y$. 

Theorem 5. $(Q_U, \sqcup, \sqcap, /-, \Phi, \top)$ is a Boolean algebra with join 
$\sqcup$, meet $\sqcap$, complement $/-$, least element $\Phi$ and greatest element 
$\top$. Moreover, the algebras $(Q_L, \sqcup, \sqcap, -/, \bot, \Phi)$ and 
$(Q_U, \sqcup, \sqcap, /-, \Phi, \top)$ are isomorphic, an isomorphism being 
$/ : Q_L \to Q_U$. 

Proof. By Prop. 1, / is a bijection. By Prop. 4 (b), / preserves $\sqcup$ and 
$\sqcap$. Also $/(-/x) = /-(/x)$, showing that / maps complements correctly. 
Finally, $/\bot = \Phi$, and $/\Phi = \top$. □ 

Proposition 6. For $x, y \in Q_M$, 

(a) $x \sqcap -x = \bot$, $x \sqcup -x = \top$, 

(b) $/(x \sqcap y) = /x \sqcup /y$, $/(x \sqcup y) = /x \sqcap /y$, 

(c) $x \sqcap y = x \oplus y = x + y$, $x \sqcup y = x \times y = x \otimes y$. 

Theorem 6. $(Q_M, \sqcap, \sqcup, -, \top, \bot)$ is a Boolean algebra with join 
$\sqcap$, meet $\sqcup$, complement $-$, least element $\top$ and greatest element 
$\bot$. Moreover, $(Q_U, \sqcup, \sqcap, /-, \Phi, \top)$ and 
$(Q_M, \sqcap, \sqcup, -, \top, \bot)$ are isomorphic, an isomorphism being 
$/ : Q_U \to Q_M$. 

Proof. The first claim follows by Theorem 3 and duality in Boolean algebras. 
Mapping / is a bijection, which behaves like an isomorphism with respect to 
the binary operations because of Prop. 5 (b). For the unary operation, we have 
$/(/-x) = //-x = -(/x)$, as required. Finally, $/\Phi = \top$ and $/\top = \bot$. □ 

In a similar fashion we verify the following: 

Theorem 7. $(Q_M, \sqcap, \sqcup, -, \top, \bot)$ and 
$(Q_L, \sqcup, \sqcap, -/, \bot, \Phi)$ are isomorphic, an isomorphism being 
$/ : Q_M \to Q_L$. 

We refer to the three Boolean algebras of a process space as its Boolean trio. 
The Hasse diagram for the partial order $\sqsubseteq$ within the Boolean algebras 
of the 27-element process space $\mathcal{P}_{27}$ is shown in Fig. 5. Note that 
rotation of the Boolean algebras is counterclockwise, whereas rotation of the 
complement operations in the three algebras is clockwise, since we have 
$//-/ = /-$, $///- = -$, and $//- = -/$. 

We close this section by showing that $-$ is an isomorphism between $Q_L$ and 
the dual of $Q_U$. 

Theorem 8. $(Q_L, \sqcup, \sqcap, -/, \bot, \Phi)$ and 
$(Q_U, \sqcap, \sqcup, /-, \top, \Phi)$ are isomorphic, an isomorphism being 
$- : Q_L \to Q_U$. 

Proof. Since $x \sqsubseteq \Phi \Leftrightarrow -x \sqsupseteq \Phi$, we have 
$Q_U = -Q_L$. Next, $-(x \sqcup y) = -x \sqcap -y$, and 
$-(x \sqcap y) = -x \sqcup -y$. Also, $-(-/x) = /x = /--x = /-(-x)$, 
as required. Finally, $-\bot = \top$, and $-\Phi = \Phi$. □ 
Fig. 5. Boolean trio of $\mathcal{P}_{27}$ 



By duality of Boolean algebras, algebra $(Q_U, \sqcap, \sqcup, /-, \top, \Phi)$ is 
isomorphic to $(Q_U, \sqcup, \sqcap, /-, \Phi, \top)$. 

Exercise. We now offer the readers an exercise to check their understanding 
of the concepts presented. To simplify the notation, we introduce the following 
symbols: A = (123,0), B = (123,1), etc., as shown in Fig. 5. The reader who 
evaluates the following expressions will be richly rewarded (example: IB = I): 
/A, l/H, -Y, PUG, //-/; 

-/G, -/(QUR), -QU-P, /M, I//-H, -(/UJ), //-H, -//; 

/O, YU/M, P®R, -/-A. □ 

6 Ordered Tripartitions 

We now consider elements $x = (X, X') = (X_1, X_2, X_3)$ that are outside the 
Boolean trio. Let $T$ be the set of all such elements; these are elements of 
$\mathcal{P}$ that are incomparable to $\Phi$ and are not minimal in the partial 
order $\preceq$. 

Proposition 7. $T = \{(X, X') \mid \emptyset \subset X, X' \subset \mathcal{E}$, 
and $X \cap X' \neq \emptyset\}$. 

We refer to partitions of a set into three blocks as tripartitions of the set. An 
ordered tripartition of $\mathcal{E}$ is an ordered triple $(X_1, X_2, X_3)$ of 
subsets of $\mathcal{E}$, such that $\{X_1, X_2, X_3\}$ is a tripartition of 
$\mathcal{E}$. 

Proposition 8. $T = \{(X_1, X_2, X_3) \mid \{X_1, X_2, X_3\}$ is a tripartition of 
$\mathcal{E}\}$. 

A sextet is an algebra $(S_6, /, -)$, where $S_6$ is a set of six elements, and 
$/$ and $-$ are unary operations satisfying $///x = x$, $--x = x$, and 
$-/x = //-x$, for all $x \in S_6$. An example of a sextet is shown in Fig. 6(a). 
Here $S_6 = \{0, 1, \ldots, 5\}$, $/x = x + 2 \pmod 6$ and $-x = 5 - x$, where the 
$-$ on the right-hand side is subtraction of integers. Figure 6(b) shows another 
example of a sextet. Here, $S_6$ consists of all the ordered tripartitions 
generated by the tripartition $\{\{1, 4\}, \{2\}, \{3\}\}$ of 
$\mathcal{E} = \{1, 2, 3, 4\}$ under the two unary operations $/$ and $-$, the 
rotation and reflection of triplets. 

Fig. 6. Illustrating sextets: (a) integers modulo 6 (b) ordered tripartitions 
generated by (14, 2, 3) 
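
Both sextet examples are small enough to check directly. The Python sketch below verifies the three sextet laws for the integers modulo 6 and generates the second sextet by closing one ordered tripartition under rotation and reflection; the function names and the `orbit` helper are assumptions introduced for this illustration.

```python
# First example: S6 = {0, ..., 5} with /x = x + 2 (mod 6) and -x = 5 - x.
def rot(x):
    return (x + 2) % 6

def neg(x):
    return 5 - x

for x in range(6):
    assert rot(rot(rot(x))) == x                 # ///x = x
    assert neg(neg(x)) == x                      # --x = x
    assert neg(rot(x)) == rot(rot(neg(x)))       # -/x = //-x

# Second example: ordered tripartitions under rotation and reflection of triplets.
def rot_t(t):
    return (t[2], t[0], t[1])

def neg_t(t):
    return (t[2], t[1], t[0])

def orbit(start):
    # Close {start} under the two unary operations.
    seen, todo = set(), [start]
    while todo:
        t = todo.pop()
        if t not in seen:
            seen.add(t)
            todo.extend([rot_t(t), neg_t(t)])
    return seen

start = (frozenset({1, 4}), frozenset({2}), frozenset({3}))
sextet = orbit(start)

for t in sextet:
    assert rot_t(rot_t(rot_t(t))) == t
    assert neg_t(neg_t(t)) == t
    assert neg_t(rot_t(t)) == rot_t(rot_t(neg_t(t)))
```

The orbit contains the six orderings of the three blocks, which is why every tripartition generates exactly one sextet.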



Proposition 9. $T$ is a disjoint union of sextets generated by all tripartitions 
of $\mathcal{E}$. 

If $\mathcal{E}$ has cardinality $n$, there are $3^n$ elements in 
$\mathcal{P}_{\mathcal{E}}$. In each Boolean algebra of the trio there are $2^n$ 
elements, for a total of $3 \times (2^n - 1)$ elements in the trio. 
Thus there are $3^n - 3 \times (2^n - 1)$ elements in $T$. For $n = 1$ and $n = 2$, 
$T$ is empty. For $n = 3$, there are six elements in $T$. These belong to the sextet 
generated by the tripartition $\{\{1\}, \{2\}, \{3\}\}$. For $n = 4$, there are 36 
elements belonging to the sextets generated by the six tripartitions 
$\{\{1, 2\}, \{3\}, \{4\}\}$, $\{\{1, 3\}, \{2\}, \{4\}\}$, 
$\{\{1, 4\}, \{2\}, \{3\}\}$, $\{\{1\}, \{2, 3\}, \{4\}\}$, 
$\{\{1\}, \{2, 4\}, \{3\}\}$, $\{\{1\}, \{2\}, \{3, 4\}\}$. 
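
The count of elements outside the trio can be reproduced by enumeration. In the Python sketch below a process over an n-element set is encoded as a labelling of each execution with its block number (1, 2 or 3), and a process lies outside the trio exactly when all three blocks are nonempty; the function name `count_outside_trio` is an assumption made for the illustration.

```python
from itertools import product as cartesian

def count_outside_trio(n):
    # Count labellings of n executions by blocks 1, 2, 3 that use all three blocks,
    # i.e. ordered tripartitions of an n-element set.
    count = 0
    for labels in cartesian((1, 2, 3), repeat=n):
        if set(labels) == {1, 2, 3}:
            count += 1
    return count

for n in range(1, 7):
    # Compare with the closed form 3^n - 3 x (2^n - 1).
    assert count_outside_trio(n) == 3**n - 3 * (2**n - 1)

# Each sextet has six elements, so the number of sextets is the count divided by 6.
sextets_for_n4 = count_outside_trio(4) // 6
```

For n = 4 the quotient matches the six tripartitions listed above.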

7 Conclusions 

We have demonstrated a ternary symmetry similar to duality. We have shown 
that every process space consists of a trio of Boolean algebras and a disjoint 
union of sextets generated by all tripartitions of the underlying set. 
Abstract. For more than 2000 years, from Pythagoras and Euclid to 
Hilbert and Bourbaki, mathematical proofs were essentially based on 
axiomatic-deductive reasoning. In the last decades, the increasing length 
and complexity of many mathematical proofs led to the expansion of 
some empirical, experimental, psychological and social aspects, yesterday 
only marginal, but now changing radically the very essence of proof. 
In this paper, we try to organize this evolution, to distinguish its different 
steps and aspects, and to evaluate its advantages and shortcomings. 
Axiomatic-deductive proofs are not a posteriori work, a luxury we can 
marginalize, nor are computer-assisted proofs bad mathematics. There is 
hope for integration! 



1 Introduction 

From Pythagoras and Euclid to Hilbert and Bourbaki, mathematical proofs were 
essentially based on axiomatic-deductive reasoning. In the last decades, the in- 
creasing length and complexity of many mathematical proofs led to the expan- 
sion of some empirical, experimental, psychological and social aspects, yesterday 
only marginal, but now changing radically the very essence of proof. Computer- 
assisted proofs and the multiplication of the number of authors of a proof became 
in this way unavoidable. 

In this paper, we try to organize this evolution, to distinguish its different 
steps and aspects and to evaluate its advantages and shortcomings. Various 
criticisms of this evolution, particularly, Ian Stewart’s claim according to which 
the use of computer programs in a mathematical proof makes it as ugly “as a 
telephone directory” while purely axiomatic-deductive proofs are “beautiful like 
Tolstoy’s War and Peace” , will be discussed. 

Like axiomatic-deductive proofs, computer-assisted proofs may oscillate 
between ugliness and beauty. The elegance of a computer program may rival the 
beauty of a piece of poetry, as the author of the Art of Computer Programming 
convinced us; however, this may not exclude the possibility that a computer 
program assisting a proof hides a central idea or obscures the global aspect of 
the proof. In particular, the program assisting a proof may not be itself “proven 
correct”, as happened in the proof of the four-color problem, even in the latest, 
improved 1996 variant. 

Computer-assisted proofs are to usual axiomatic-deductive proofs what 
(high-school) algebraic approaches are to arithmetic approaches or what analytical 
approaches are to direct geometric approaches. Arithmetic and intuitive geometry 
make children’s brains more active, but algebra and analytic geometry, leading 
to routine and general formulas, diminish the intellectual effort and free their 
brains for new, more difficult problems. Obviously, each of these approaches has 
advantages and shortcomings, its beauty and ugliness; they are not antithetical, 
but complementary. Axiomatic-deductive proofs are not a posteriori work, 
a luxury we can marginalize, nor are computer-assisted proofs bad mathematics. 
There is hope for integration! 



2 Proofs in General 

Proofs are used in everyday life and they may have nothing to do with mathe- 
matics. There is a whole field of research, at the intersection of logic, linguistics, 
law, psychology, sociology, literary theory etc., concerning the way people argue: 
argumentation theory. Sometimes, this is a subject taught to 15- or 16-year-old 
students. 

In the “Oxford American Dictionary” [16] we read: 

Proof: 1. a fact or thing that shows or helps to show that something is true 
or exists; 2. a demonstration of the truth of something, “in proof of my statement”; 
3. the process of testing whether something is true or good or valid, 
“put it to the proof”. To prove: to give or be proof of; to establish the validity 
of; to be found to be, “it proved to be a good theory”; to test or stay out. To 
argue: 1. to express disagreement, to exchange angry words; 2. to give reasons 
for or against something, to debate; 3. to persuade by talking, “argued him 
into going”; 4. to indicate, “their style of living argues that they are well off”. 
Argument: 1. a discussion involving disagreement, a quarrel; 2. a reason put 
forward; 3. a theme or chain of reasoning. 

In all these statements, nothing is said about the means used “to show or help 
to show that something is true or exists”, about the means used “in the process 
of testing whether something is true or good or valid”. In argumentation theory, 
various ways to argue are discussed, deductive reasoning being only one of them. 
The literature in this respect goes from classical rhetoric to recent developments 
such as [28]. People argue by all means. We use suggestions, impressions, emotions, 
logic, gestures, mimicry, etc. 

What is the relation between proof in general and proof in mathematics? 
It seems that the longer a mathematical proof is, the more likely it is to 
contain elements usually belonging to non-mathematical proofs. We have in view 
emotional, affective, intuitive, social elements related to fatigue, memory gaps, 
etc. Long proofs are not necessarily computational; the proof of Fermat’s theorem 
and the proof of Bieberbach’s conjecture did not use computer programs, but 
they paid a price for their long lengths. 

3 From Proofs to Mathematical Proofs 

Why did mathematical proofs, beginning with Thales, Pythagoras and Euclid, 
till recently, use only deductive reasoning? First of all, deduction, syllogistic 
reasoning is the most visible aspect of a mathematical proof, but not the only 
one. Observation, intuition, experiment, visual representations, induction, anal- 
ogy and examples have their role; some of them belong to the preliminary steps, 
whose presence is not made explicit, but without which proofs cannot be con- 
ceived. As a matter of fact, neither deduction, nor experiment could be com- 
pletely absent in a proof, be it the way it was conceived in Babylonian mathemat- 
ics, predominantly empirical, or in Greek mathematics, predominantly logical. 
The problem is one of proportion. In the 1970s, for the first time in the history 
of mathematics, empirical-experimental tools, in the form of computer 
programs, penetrated mathematics on a massive scale and led to a solution of 
the four-color problem (4CP), a solution which is still an object of debate and 
controversy; see Appel and Haken [1], Tymoczko [39], Swart [37], Marcus [27], 
and A. Calude [10].¹ 

Clearly, any proof, be it mathematical or not, is a very heterogeneous process, 
where different ingredients are involved in various degrees. The increasing role of 
empirical-experimental factors may recall the Babylonian mathematics, with the 
significant difference that the deductive component, today impressive, was then 
very poor. But what is the difference between ‘proof’ and ‘mathematical proof’? 
The difficulty of this question is related to the fact that proofs which are not 
typically mathematical may occur in mathematics too, while some mathematical 
reasonings may occur in non-mathematical contexts. Many combinatorial real- 
life situations require a mathematical approach, while games like chess require 
deductive thinking (although chess thinking seems to be much more than deduction). 
In order to identify the nature of a mathematical proof we should first 
delimit the idea of a ‘mathematical statement’, i.e. a statement that requires a 
mathematical proof. Most statements in everyday life are not of this type. Even 
most statements of the type ‘if ..., then ...’ are not mathematical statements. 
At what moment does mathematics enter the scene? The answer is related to 
the conceptual status of the involved terms and predicates. Usually, problems 
raised by non-mathematicians are not yet mathematical problems, they may be 
farther or nearer to this status. The problem raised to Kepler, about the densest 
packing, in a container, of some apples of similar dimensions, was very near to 
a mathematical one and it was easy to find its mathematical version. The task 
was more difficult for the 4CP, where things like ‘map’, ‘colors’, ‘neighbor’, and 
‘country’ required some delicate analysis until their mathematical models were 
identified. On the other hand, a question such as ‘do you love me?’ still remains 
far from a mathematical modelling process. 
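
As an illustration of what such a modelling step may look like, here is a minimal sketch in which ‘countries’ become vertices, ‘neighbor’ becomes a symmetric adjacency relation and ‘colors’ become integers. The map, the names `neighbors`, `is_proper_coloring` and `colorable`, and the brute-force search are our own assumptions for the example; they are not taken from any of the proofs of the 4CP.

```python
from itertools import product


def is_proper_coloring(neighbors, coloring):
    """True iff no two neighboring countries receive the same color."""
    return all(
        coloring[a] != coloring[b]
        for a, adjacent in neighbors.items()
        for b in adjacent
    )


def colorable(neighbors, k):
    """Exhaustively test whether the map admits a proper coloring with k colors."""
    countries = sorted(neighbors)
    return any(
        is_proper_coloring(neighbors, dict(zip(countries, colors)))
        for colors in product(range(k), repeat=len(countries))
    )


# Hypothetical map: four mutually adjacent countries
# (for instance, one country surrounded by three others).
neighbors = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
}

print(colorable(neighbors, 3), colorable(neighbors, 4))
```

Once the informal terms are fixed in this way, the question ‘can this map be colored with k colors?’ becomes a finite, mechanically checkable statement, although exhaustive search of this kind is feasible only for very small maps.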



¹ We have discussed this issue in detail in a previous article [11]. 

4 Where Does the Job of Mathematicians Begin? 

Is the transition from statements in general to mathematical statements the job 
of mathematicians? Mathematicians are divided in answering this question. Hugo 
Steinhaus’s answer was definitely yes, Paul Erdős’s answer was clearly negative. 
The former liked to see in any piece of reality a potential mathematical problem, 
the latter liked to deal with problems already formulated in a clear mathematical 
language. Many intermediate situations are possible, and they give rise to a whole 
typology of mathematicians. Goethe’s remark about mathematicians’ habit of 
translating into their own language what you tell them and making in this way 
your question completely hermetic refers just to this transition, sometimes of 
high difficulty. 

If in mathematical research both above attitudes are interesting, useful and 
equally important, in the field of mathematical education of the general public 
the yes attitude seems more important than the negative one and deserves prior- 
ity. The social failure of mathematics to be recognized as a cultural enterprise is 
due, to a large extent, to the insufficient attention paid to its links to other fields 
of knowledge and creativity. This means that, in general mathematical education, 
besides the scenario with definitions-axioms-lemmas-theorems-proofs-corollaries- 
examples-applications we should consider, with at least the same attention, the 
scenario stressing problems, concepts, examples, ideas, motivations, the histor- 
ical and cultural context, including links to other fields and ways of thinking. 
Are these two scenarios incompatible? Not at all. It happens that the second 
scenario was systematically neglected; but the historical reasons for this mistake 
will not be discussed here (see more in [29,30]). 

Going back to proof, perhaps the most important task of mathematical ed- 
ucation is to explain why, in many circumstances, informal statements of prob- 
lems and informal proofs are not sufficient; then, how informal statements can 
be translated into mathematical ones. This task is genuinely related to the ex- 
planation of what is the mathematical way of thinking, in all its variants: combi- 
natorial, deductive, inductive, analogical, metaphorical, recursive, algorithmic, 
probabilistic, infinite, topological, binary, triadic, etc., and, above all, the step- 
by-step procedure leading to the need to use some means transcending the nat- 
ural language (artificial symbols of various types and their combinations). 



5 Proofs: From Pride to Arrogance 

With Euclid’s Elements, for a long time taken to be a model of rigor, mathe- 
maticians became proud of their science, claimed to be the only one giving the 
feeling of certainty, of complete confidence in its statements and ways of arguing. 
Despite some mishaps occurring in the 19th century and in the first half of the 
20th century, mathematicians continued to trust in axiomatic-deductive rigor, 
with the improvements brought by Hilbert’s ideas on axiomatics and formaliza- 
tion. With Bourbaki’s approach, towards the middle of the 20th century, some 
mathematicians changed pride into arrogance, imposing a ritual excluding any 
concession to non-formal arguments. ‘Mathematics’ means ‘proof’ and ‘proof’ 
means ‘formal proof’, is the new slogan. 

Depuis les Grecs, qui dit Mathématique, dit démonstration 
(Since the Greeks, whoever says mathematics says proof) 

is Bourbaki’s slogan, while Mac Lane’s [25] austere doctrine reads 

If a result has not yet been given valid proof, it isn’t yet mathematics: we 
should strive to make it such. 

Here, the proof is conceived according to the standards established by Hilbert, 
for whom a proof is a demonstrative text starting from axioms and where each 
step is obtained from the preceding ones, by using some pre-established explicit 
inference rules: 

The rules should be so clear, that if somebody gives you what they claim 
is a proof, there is a mechanical procedure that will check whether the 
proof is correct or not, whether it obeys the rules or not. 

And according to Jaffe and Quinn [20] 

Modern mathematics is nearly characterized by the use of rigorous proofs. 
This practice, the result of literally thousands of years of refinement, has 
brought to mathematics a clarity and reliability unmatched by any other 
science. 

This is a linear-growth model of mathematics (see Stöltzner [36]), a process 
in two stages. First, informal ideas are guessed and developed, conjectures are 
made, and outlines of justifications are suggested. Secondly, conjectures and 
speculations are tested and corrected; they are made reliable by proving them. 
The main goal of proof is to provide reliability to mathematical claims. The act 
of finding a proof often yields, as a by-product, new insights and possibly unex- 
pected new data. So, by making sure that every step is correct, one can tell once 
and for all whether a theorem has been proved. Simple! A moment of reflection 
shows that the case may not be so simple. For example, what if the “agent” (hu- 
man or computer) checking a proof for correctness makes a mistake (as pointed 
out by Lakatos [24], agents are fallible)? Obviously, another agent has to check 
that the agent doing the checking did not make any mistake. Some other agent 
will need to check that agent, and so on. Eventually either the process continues 
unendingly (an unrealistic scenario?), or one runs out of agents who could check 
the proof and, in principle, they could all have made a mistake! Finally, the 
linear-growth model is built on an asymmetry of proof and conjecture: Posing 
the latter does not necessarily involve proof. 
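
To make the idea of a ‘mechanical procedure’ concrete, here is a toy sketch of a checker for proofs whose only inference rule is modus ponens. The encoding of formulas as strings and tuples, and the names `implies` and `check_proof`, are assumptions made for this illustration; the sketch is not a rendering of Hilbert’s formal systems.

```python
def implies(a, b):
    """Encode the formula 'a -> b' as a tagged tuple (a hypothetical encoding)."""
    return ("->", a, b)


def check_proof(axioms, steps):
    """Return True iff every step is an axiom or follows by modus ponens.

    Each step is a pair (formula, justification). A justification is either
    the string "axiom" or a pair (i, j) of indices of earlier steps, where
    step i is some formula A and step j is the formula A -> formula.
    """
    derived = []
    for formula, why in steps:
        if why == "axiom":
            ok = formula in axioms
        else:
            i, j = why
            ok = (
                0 <= i < len(derived)
                and 0 <= j < len(derived)
                and derived[j] == implies(derived[i], formula)
            )
        if not ok:
            return False
        derived.append(formula)
    return True


axioms = {"p", implies("p", "q"), implies("q", "r")}

valid_proof = [
    ("p", "axiom"),
    (implies("p", "q"), "axiom"),
    ("q", (0, 1)),
    (implies("q", "r"), "axiom"),
    ("r", (2, 3)),
]

broken_proof = [
    ("p", "axiom"),
    ("r", (0, 0)),  # "r" does not follow from "p" alone
]

print(check_proof(axioms, valid_proof), check_proof(axioms, broken_proof))
```

The sketch also illustrates the regress described above: the checker is itself a program that has not been proved correct, so trusting its verdict means trusting yet another fallible agent.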

The Hilbert-Bourbaki model has its own critics, some from outside mathe- 
matics such as Lakatos [24] 

. . . those who, because of the usual deductive presentation of mathemat- 
ics, come to believe that the path of discovery is from axiom and/or 
definitions to proofs and theorems, may completely forget about the pos- 
sibility and importance of naive guessing 
some from eminent mathematicians such as Atiyah [2]: 

[20] present a sanitized view of mathematics which condemns the subject 
to an arthritic old age. They see an inexorable increase in standards and 
are embarrassed by earlier periods of sloppy reasoning. But if mathemat- 
ics is to rejuvenate itself and break new ground it will have to allow for 
the exploration of new ideas and techniques which, in their creative phase, 
are likely to be dubious as in some of the great eras of the past. Perhaps 
we now have high standards of proof to aim at but, in the early stages 
of new developments, we must be prepared to act in more buccaneering 
style. 

Atiyah’s point meets Lakatos’s [24] views 

... informal, quasi-empirical mathematics does not grow through a monotonous 
increase of the number of indubitably established theorems, but 
through the incessant improvement of guesses by speculation and criti- 
cism, by the logic of proof and refutation 

and is consistent with the idea that the linear-growth model tacitly requires a 
‘quasi-empirical’ ontology, as noted by Hirsch in his contribution to the debate 
reported in [2]: 

For if we don’t assume that mathematical speculations are about ‘reality’ 
then the analogy with physics is greatly weakened — and there is no rea- 
son to suggest that a speculative mathematical argument is a theory of 
anything, any more than a poem or novel is ‘theoretical’. 

6 Proofs: From Arrogance to Prudence 

It is well known that the doubt that arose with respect to Hilbert-Bourbaki 
rigor was caused by Gödel’s incompleteness theorem;² see, for instance, Kline’s 
Mathematics, the Loss of Certainty [21]. It is not by chance that a similar title 
was used later by Ilya Prigogine in respect to the development of physics. So, 
arrogance was more and more replaced by prudence. All rigid attitudes, based 
on binary predicates, no longer correspond to the new reality, and they should be 
considered ‘cum grano salis’. The decisive step in this respect was accomplished 
by the spread of empirical-experimental factors in the development of proofs. 

² The result has generated a variety of reactions, ranging from pessimism (the final, 
definite failure of any attempt to formalise all of mathematics) to optimism (a guarantee 
that mathematics will go on forever) or simple dismissal (as irrelevant for 
the practice of mathematics). See more in Barrow [4], Chaitin [12] and Rozenberg 
and Salomaa [34]. The main pragmatic conclusion seems to be that ‘mathematical 
knowledge’, whatever this may mean, cannot be derived solely from some fixed 
rules. Then, who validates the ‘mathematical knowledge’? Wittgenstein’s answer 
was that the acceptability ultimately comes from the collective opinion of the social 
group of people practising mathematics. 







7 Assisted Proofs Vs. Long Proofs, or from Prudence to 
Humility 

The first major step was realized in 1976, with the discovery, using a massive 
computer computation, of a proof of the 4CP. This event should be related to 
another one: the increasing length of some mathematical proofs. Obviously, the 
length l(p(s)) of the proof p(s) of the statement s should be assessed 
relative to the length l(s) of s. There is a proposal to require the existence of 
a strictly positive constant k such that, for any reasonable theorem, the ratio 
l(p(s))/l(s) lies between 1/k and k. But the existence of such a k may 
remain an eternal challenge. 
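
Written out as a formula, the proposal reads as follows. This is only our transcription of the sentence above; the predicate ‘reasonable theorem’ is left informal, as in the text.

```latex
\exists\, k > 0 \quad \forall s \ \text{(reasonable theorem)}: \qquad
\frac{1}{k} \;\le\; \frac{l(p(s))}{l(s)} \;\le\; k .
```

Note that the two bounds are compatible only if 1/k ≤ k, that is, only if k ≥ 1; a single constant k would then bound both how short and how long a proof may be relative to its statement.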

In the past, theorems with too long a statement were very rare. Early ex- 
amples of this type can be found in Apollonius’s Conica written some time 
after 200 BC. More recent examples include some theorems by Arnaud Denjoy, 
proved in the first decades of the 20th century, and Jordan’s theorem (1870) 
concerning the way a simple closed curve c separates the plane in two domains 
whose common frontier is c. A strong trend towards long proofs appears in the 
second half of the 20th century. We exclude here the artificial situation when 
theorems with long statements and long proofs can be decomposed into several 
theorems, with normal lengths. We refer to statements having a clear meaning, 
whose unity and coherence are lost if they are not maintained in their initial 
form. The 4CP is just of this type. Kepler’s conjecture is of the same type and 
so are Fermat’s theorem, Poincaré’s conjecture and Riemann’s hypothesis. What 
about the theorem giving the classification of finite simple groups? In contrast 
with the preceding examples, in this case the statement of the theorem is very 
long. It may be interesting to observe that some theorems which are in complete 
agreement with our intuition, like Jordan’s and Kepler’s, require long proofs, 
while some other theorems, in conflict with our intuition, such as the theorem 
asserting the existence of three domains in the plane having the same frontier, 
have a short proof. Ultimately, everything depends on the way the mathematical 
text is segmented in various pieces. 

The proof of the theorem giving the typology of the finite simple groups required 
a total of about fifteen thousand pages, spread over five hundred separate 
articles by about three hundred different authors (see Conder [15]). 
But Serre [31] is still waiting for experts to check the claim by Aschbacher and 
Smith to have succeeded in filling the gap in the proof of the classification theorem, 
a gap already discovered in 1980 by Daniel Gorenstein. The gap concerned 
that part which deals with ‘quasi-thin’ groups. Despite this persisting doubt, 
most parts of the global proof were already published in various prestigious jour- 
nals. The ambition of rigor was transgressed by the realities of mathematical life. 
Moreover, while each author had personal control of his own contribution (ex- 
cepting the mentioned gap), the general belief was that the only person having 
a global, holistic representation and understanding of this theorem was Daniel 
Gorenstein, who unfortunately died in 1992. So, the classification theorem is still 
looking for its validity and understanding. 

The story of the classification theorem points out the dramatic fate of some 
mathematical truths, whose recognition may depend on sociological factors which 
are no longer under the control of the mathematical community. This situation 
is not isolated. Think of Fermat’s theorem, whose proof (by Wiles) was checked 
by a small number of specialists in the field, but the fact that here we had several 
‘Gorensteins’, not only one, does not essentially change the situation. 

How do exceedingly long proofs compare with assisted proofs? In 1996 Robert- 
son, Sanders, Seymour and Thomas [32] offered a simpler proof of the 4CP. They 
conclude with the following interesting comment (p. 24): 

We should mention that both our programs use only integer arithmetic, 
and so we need not be concerned with round-off errors and similar dangers 
of floating point arithmetic. However, an argument can be made that 
our “proof” is not a proof in the traditional sense, because it contains 
steps that can never be verified by humans. In particular, we have not 
proved the correctness of the compiler we compiled our programs on, nor 
have we proved the infallibility of the hardware we ran our programs on. 
These have to be taken on faith, and are conceivably a source of error. 
However, from a practical point of view, the chance of a computer error 
that appears consistently in exactly the same way on all runs of our 
programs on all the compilers under all the operating systems that our 
programs run on is infinitesimally small compared to the chance of a 
human error during the same amount of case-checking. Apart from this 
hypothetical possibility of a computer consistently giving an incorrect answer, 
the rest of our proof can be verified in the same way as traditional 
mathematical proofs. We concede, however, that verifying a computer 
program is much more difficult than checking a mathematical proof of 
the same length.³ 

Knuth [22] p. 18 confirms the opinion expressed in the last lines of the previous 
paragraph: 

... program-writing is substantially more demanding than book-writing. 
Why is this so? I think the main reason is that a larger attention span is 
needed when working on a large computer program than when doing other 
intellectual tasks. ... Another reason is ... that programming demands 
a significantly higher standard of accuracy. Things don’t simply have to 
make sense to another human being, they must make sense to a computer. 

And indeed, Knuth compared his TeX compiler (a document of about 500 pages) 
with Feit and Thompson’s [17] theorem that all finite simple groups of odd order 
are cyclic. He lucidly argues that the program might not incorporate as much 
creativity and “daring” as the proof of the theorem, but they come out even when 
compared on depth of detail, length and paradigms involved. What distinguishes 
the program from the proof is the “verification”: convincing a couple of (human) 
experts that the proof works in principle seems to be easier than making sure that 
the program really works. A demonstration that there exists a way to compile 
TeX is not enough! Hence Knuth’s warning: “Beware of bugs in the above code: 
I have only proved it correct, not tried it.” 

³ Our emphasis. 

This is the moment to ask, together with R. Graham: “If no human being 
can ever hope to check a proof, is it really a proof?” Continuing this question, we 
may ask: What about the fate of a mathematical theorem whose understanding is 
in the hands of only a few persons? Let us observe that in both cases discussed 
above (4CP and the classification theorem) it is not only the global, holistic 
understanding that is in question, but also the local validity of the proof. 

Another example of humility some eminent mathematicians are forced to 
adopt with respect to yesterday’s high exigency of rigor was given recently by 
one of the most prestigious mathematical journals, situated for a long time at 
the top of mathematical creativity: Annals of Mathematics. We learn from Karl 
Sigmund [35] that the proof proposed by Thomas Hales in August 1998 and the 
corresponding joint paper by Hales and Ferguson confirming Kepler’s conjecture 
about the densest possible packing of unit spheres into a container, was accepted 
for publication in the Annals of Mathematics, 

but with an introductory remark by the editors, a disclaimer as it were, 
stating that they had been unable to verify the correctness of the 250-page 
manuscript with absolute certainty. 

The proof is so long and based to such an extent on massive computations, that 
the platoon of mathematicians charged with the task of checking it ran out of 
steam. Robert MacPherson, the Annals’ editor in charge of the project, stated 
that “the referees put a level of energy into this that is, in my experience, un- 
precedented. But they ended up being only 99 percent certain that the proof was 
correct”. However, not only the referees but also the author himself, Thomas Hales, ‘was 
exhausted’, as Sigmund observes. He was advised to re-write the manuscript: he 
didn’t, but instead he started another project, ‘Formal Proof of Kepler’ (FPK), a 
project which puts theorem-verification on an equal footing with Knuth’s program-verification. 
Programming a machine to check human reasoning gives a new type 
of insight which has its own kind of beauty. Here is the bitter-ironical comment 
by Sigmund: 

After computer-based theorem-proving, this is the next great leap forward: 
computer-based proof checking. Pushed to the limit, this would seem to 
entail a self-referential loop. Maybe the purists who insist that a proof is 
a proof if they can understand it are right after all. On the other hand, 
computer-based refereeing is such a promising concept, for reviewers, 
editors, and authors alike, that it seems unthinkable that the community 
will not succumb to the temptation. 

So, what is the perspective? It appears that FPK will require 20 man-years 
to check every single step of Hales’ proof. “If all goes well, we then can be 100 
percent certain”, concludes Sigmund ([35], p. 67). 

Let us recall that Perelman’s recent proof of Poincaré’s conjecture⁴ is still 
being checked at MIT (Cambridge) and IHES (Paris) and who knows how long 
this process will be? We enter a period in which mathematical assessment will 
increase in importance and will use, in its turn, computational means. The job 
of an increasing number of mathematicians will be to check the work of other 
mathematicians. We have to learn to reward this very difficult work, to pay it 
at its correct value. 

One could think that the new trend fits the linear-growth model: all experi- 
ments, computations and simulations, no matter how clever and powerful, belong 
to and are to stay at the first stage of mathematical research where informality 
and guessing are dominant. This is not the case. Of course, some automated 
heuristics will belong only to the first stage. The shift is produced when a large 
part of the results produced by computing experiments are transferred to the 
second stage; they no longer only develop the intuition, they no longer only build 
hypotheses, but they assist the very process of proof, from discovery to checking, 
they create a new type of environment in which mathematicians can undertake 
mathematical research. 

For some, a proof including computer programs is like a telephone directory, 
while a human proof may compete with a beautiful novel. This analogy refers to 
the exclusively syntactic nature of a computer-based proof (where we learn that the 
respective proof is valid, but we may not understand why), contrasting 
with the attention paid to the semantic aspect, to the understanding process, 
in the traditional proofs, exclusively made by humans.⁵ The criticism implied 
by this analogy, which is very strong in René Thom’s writings, is not always 
justified. In fact, an ‘elegant’ program⁶ may help the understanding process 
of mathematical facts in a completely new way. We confine ourselves to a few 
examples only: 

... if one can program a computer to perform some part of mathematics, 
then in a very effective sense one does understand that part of mathematics 
(G. Tee [38]) 

If I can give an abstract proof of something, I’m reasonably happy. But if 
I can get a concrete, computational proof and actually produce numbers 
I’m much happier. I’m rather an addict of doing things on computer, 
because that gives you an explicit criterion of what’s going on. I have a 
visual way of thinking, and I’m happy if I can see a picture of what I’m 
working with (J. Milnor, [7]) 

... computer-based proofs are often more convincing than many standard 
proofs based on diagrams which are claimed to commute, arrows which 
are supposed to be the same, and arguments which are left to the reader 
(J.-P. Serre [31]) 

... the computer changes epistemology, it changes the meaning of “to 
understand.” To me, you understand something only if you can program 
it. (G. Chaitin [14]) 

It is the right moment to reject the idea that computer-based proofs are necessarily 
ugly and opaque, resistant not only to being checked for correctness but also 
to being understood in their essence. 

⁴ Mathematicians familiar with Perelman’s work expect that it will be difficult to 
locate any substantial mistakes, cf. Robinson [33]. 

⁵ The conjugate pair rigor-meaning deserves to be reconsidered, cf. Marcus [26]. 

⁶ Knuth’s concept of treating a program as a piece of literature, addressed to human 
beings rather than to a computer; see [23]. 

Finally, do axiomatic-deductive proofs remain an a posteriori work, a luxury 
we can marginalize? When asked whether “when you are doing mathematics, 
can you know that something is true even before you have the proof?”, Serre 
([31], p. 212) answers: “Of course, this is very common”. But he adds: “But one 
should distinguish between the genuine goal [. . . ] which one feels is surely true, 
and the auxiliary statements (lemmas, etc.), which may well be intractable (as 
happened to Wiles in his first attempt) or even downright false [. . . ].” 

8 A Possible Readership Crisis and the Globalization of 
the Proving Process 

Another aspect of very long (human or computer-assisted) proofs is the risk of 
finding no competent reader for them, no professional mathematician ready to 
spend a long period to check them. This happened with the famous Bieberbach 
conjecture. In 1916, L. Bieberbach conjectured a necessary condition on an analytic 
function to map the unit disk injectively into the complex plane. The statement concerns 
the (normalised) Taylor coefficients a_n of such a function (a_0 = 0, a_1 = 1): it 
then states that |a_n| is at most n, for any positive integer n. Various mathematicians 
succeeded in proving the required inequality for particular values of 
n, but not for every n. In March 1984, Louis de Branges (from Purdue University, 
Lafayette) claimed a proof, but nobody trusted him, because he had previously 
made wrong claims for other open problems. Moreover, nobody in the USA agreed 
to read his 400-page manuscript to check his proof, representing seven years 
of hard work. The readership crisis ended when Louis de Branges proposed to 
the Russian mathematician I. M. Milin that he check the proof; Milin was the 
author of a conjecture implying Bieberbach’s conjecture. De Branges travelled 
to Leningrad, where after a period of three months of confrontation with a team 
formed by Milin and two other Russian mathematicians, E. G. Emelianov and 
G. V. Kuzmina, they all reached the conclusion that various mistakes existing in 
the proof were all benign. Stimulated by this fact, two German mathematicians, 
G. F. Gerald and G. Pommerenke (Technical University, Berlin) succeeded in 
simplifying De Branges’s proof. 
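
The bound in Bieberbach’s conjecture cannot be improved. A standard example, not mentioned in the text above, is the Koebe function k(z) = z/(1 - z)^2, which is known to be injective on the unit disk and whose n-th Taylor coefficient equals n. The following sketch computes its first coefficients by multiplying power series; the function name `koebe_coefficients` is our own choice for the illustration.

```python
def koebe_coefficients(n_terms):
    """Taylor coefficients a_0, ..., a_{n_terms - 1} of k(z) = z / (1 - z)**2.

    Starts from 1/(1 - z) = 1 + z + z^2 + ..., squares this series by a
    Cauchy product, then multiplies by z (a shift by one position).
    """
    geometric = [1] * n_terms  # coefficients of 1/(1 - z)
    squared = [
        sum(geometric[i] * geometric[m - i] for i in range(m + 1))
        for m in range(n_terms)
    ]  # coefficients of 1/(1 - z)^2
    return [0] + squared[: n_terms - 1]  # multiplication by z


coeffs = koebe_coefficients(8)
print(coeffs)
```

The computed values satisfy a_0 = 0, a_1 = 1 and a_n = n, so the inequality |a_n| ≤ n holds with equality for every n in this case. Such a computation checks finitely many coefficients of one function only; it illustrates the statement and says nothing about its proof.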

This example is very significant for the globalization of mathematical re- 
search, a result of the globalization of communication and of international co- 
operation. It is no exaggeration to say that mathematical proof has now a global 
dimension. 

9 Experimental Mathematics or the Hope for It 

The emergence of powerful mathematical computing environments such as Mathematica, 
MATLAB, or Maple, the increasing availability of powerful (multi-processor) 
computers, and the omnipresence of the Internet allowing mathematicians 
to proceed heuristically and ‘quasi-inductively’, have created a blend 
of logical and empirical-experimental arguments which is called “quasi-empirical 
mathematics” (by Tymoczko [39], Chaitin [13]) or “experimental mathematics” 
(Borwein, Bailey [8], Borwein, Bailey, Girgensohn [9]). Mathematicians increas- 
ingly use symbolic and numerical computation, visualisation tools, simulation 
and data-mining. New types of proofs motivated by the experimental “ideol- 
ogy” have appeared. For example, the interactive proof (see Goldwasser, Micali, 
Rackoff [18], Blum [5]) or the holographic proof (see Babai [3]). And, of course, 
these new developments have put the classical idea of axiomatic-deductive proof 
under siege (see [11] for a detailed discussion). 

Two programmatic ‘institutions’ are symptomatic of the new trend: the Centre 
for Experimental and Constructive Mathematics (CECM)⁷ and the journal 
Experimental Mathematics.⁸ Here are their working ‘philosophies’: 

At CECM we are interested in developing methods for exploiting math- 
ematical computation as a tool in the development of mathematical in- 
tuition, in hypothesis building, in the generation of symbolically assisted 
proofs, and in the construction of a flexible computer environment in 
which researchers and research students can undertake such research. 
That is, in doing experimental mathematics. [6] 

Experimental Mathematics publishes formal results inspired by exper- 
imentation, conjectures suggested by experiments, surveys of areas of 
mathematics from the experimental point of view, descriptions of algo- 
rithms and software for mathematical exploration, and general articles 
of interest to the community. 

For centuries mathematicians have used experiments, some leading to im- 
portant discoveries: the Gibbs phenomenon in Fourier analysis, the determinis- 
tic chaos phenomenon, fractals. Wolfram’s extensive computer experiments in 
theoretical physics paved the way for his discovery of simple programs having 
extremely complicated behavior [40]. Experimental mathematics — as system- 
atic mathematical experimentation ranging from hypotheses building to assisted 
proofs and automated proof-checking — will play an increasingly important role 
and will become part of the mainstream of mathematics. There are many reasons 
for this trend: they range from logical (the absolute truth simply doesn’t exist), 
sociological (correctness is not absolute as mathematics advances by making mis- 
takes and correcting and re-correcting them), economic (powerful computers will 
be accessible to more and more people), to psychological (results and success

CECM: www.cecm.sfu.ca.

Experimental Mathematics: www.expmath.org.
inspire emulation). The computer is the essential, but not the only tool. New the-
oretical concepts will emerge, for example, the systematic search for new axioms.
Assisted-proofs are not only useful and correct, but they have their own beauty 
and elegance, impossible to find in classical proofs. The experimental trend is 
not antithetical to the axiomatic-deductive approach, it complements it. Nor is 
the axiomatic-deductive proof a posteriori work, a luxury we can marginalize. 
There is hope for integration! 



Acknowledgment 

We are grateful to Greg Chaitin and Garry Tee for useful comments and refer- 
ences. 

References 

1. K. Appel, W. Haken. Every Planar Graph is Four Colorable, Contemporary Math- 
ematics 98, AMS, Providence, 1989. 

2. M. Atiyah et al. Responses to ‘Theoretical mathematics: Toward a cultural synthe- 
sis of mathematics and theoretical physics’, Bulletin of AMS 30 (1994), 178-211. 

3. L. Babai. Probably true theorems, cry wolf? Notices of AMS 41 (5) (1994), 453- 
454. 

4. J. Barrow. Impossibility-The Limits of Science and the Science of Limits, Oxford 
University Press, Oxford, 1998. 

5. M. Blum. How to prove a theorem so no one else can claim it. Proceedings of 
the International Congress of Mathematicians, Berkeley, California, USA, 1986, 
1444-1451. 

6. J. M. Borwein. Experimental Mathematics and Integer Relations at www. 
ercim.org/publication/Ercim_News/enw50/borwein.html. 

7. J. M. Borwein. www.cecm.sfu.ca/personal/jborwein/CRM.html. 

8. J. M. Borwein, D. Bailey. Mathematics by Experiment: Plausible Reasoning in the 
21st Century, A. K. Peters, Natick, MA, 2003.

9. J. M. Borwein, D. Bailey, R. Girgensohn. Experimentation in Mathematics: Com-
putational Paths to Discovery, A. K. Peters, Natick, MA, 2004.

10. A. S. Calude. The journey of the four colour theorem through time. The NZ Math. 
Magazine 38, 3 (2001), 27-35. 

11. C. S. Calude, E. Calude, S. Marcus. Passages of Proof, Los Alamos preprint 
archive, arXiv:math.HO/0305213, 16 May 2003.

12. G. J. Chaitin. The Unknowable, Springer Verlag, Singapore, 1999. 

13. G. J. Chaitin. Exploring Randomness, Springer Verlag, London, 2001. 

14. G. J. Chaitin. Meta Math!, E-book at www.cs.auckland.ac.nz/CDMTCS/chaitin/ 
omega.html. 

15. M. Conder. Pure mathematics: An art? or an experimental science? NZ Science 
Review 51, 3 (1994), 99-102. 

16. E. Ehrlich, S. B. Flexner, G. Carruth, J. M. Hawkins. Oxford American Dictio- 
nary, Avon Publishers of Bard, Camelot, Discus and Flare Books, New York, 
1982. 

17. W. Feit, J. G. Thompson. Solvability of groups of odd order, Pacific J. Math. 13
(1963), 775-1029. 







Cristian S. Calude and Solomon Marcus 



18. S. Goldwasser, S. Micali, C. Rackoff. The knowledge complexity of interactive 
proof-systems, SIAM J. Comput., 18(1) (1989), 186-208. 

19. R. Hersh. What Is Mathematics, Really?, Vintage, London, 1997. 

20. A. Jaffe and F. Quinn. Theoretical mathematics: Toward a cultural synthesis of 
mathematics and theoretical physics. Bulletin of AMS 29 (1993), 178-211. 

21. M. Kline. Mathematics: The Loss of Certainty, Oxford University Press, Oxford, 
1982. 

22. D. E. Knuth. Theory and practice, EATCS Bull. 27 (1985), 14-21. 

23. D. E. Knuth. Literate Programming, CSLI Lecture Notes, no. 27, Stanford, Cali- 
fornia, 1992. 

24. I. Lakatos. Proofs and Refutations. The Logic of Mathematical Discovery, John 
Worrall and Elie Zahar (eds.), Cambridge University Press, Cambridge, 1966. 

25. S. Mac Lane. Despite physicists, proof is essential in mathematics, Synthese 111 
(1997), 147-154. 

26. S. Marcus. No system can be improved in all respects, in C. Altmann, W. Koch 
(eds.) Systems: New Paradigms for the Human Sciences, Walter de Gruyter,
Berlin, 1998, 143-164. 

27. S. Marcus. Ways of Thinking, Scientific and Encyclopedic Publ. House, Bucharest, 
1987. (in Romanian) 

28. C. Perelman, L. Olbrechts-Tyteca. Traité de l’Argumentation. La Nouvelle
Rhétorique, Éditions de l’Université de Bruxelles, Bruxelles, 1988.

29. G. Polya. How to Solve It, Princeton University Press, Princeton, 1957. (2nd 
edition) 

30. G. Polya. Mathematics and Plausible Reasoning, Volume 1: Induction and Analogy 
in Mathematics, Volume 2: Patterns of Plausible Inference, Princeton University 
Press, Princeton, 1990. (reprint edition) 

31. M. Raussen, C. Skau. Interview with Jean-Pierre Serre, Notices of AMS, 51, 2 
(2004), 210-214. 

32. N. Robertson, D. Sanders, P. Seymour, R. Thomas. A new proof of the four-colour 
theorem. Electronic Research Announcements of AMS 2,1 (1996), 17-25. 

33. S. Robinson. Russian reports he has solved a celebrated math problem. The New 
York Times, April 15 (2003), p.ED3. 

34. G. Rozenberg, A. Salomaa. Cornerstones of Undecidability, Prentice-Hall, New 
York, 1994. 

35. K. Sigmund. Review of George G. Szpiro. “Kepler’s Conjecture”, Wiley, 2003, 
Mathematical Intelligencer, 26, 1 (2004), 66-67. 

36. M. Stöltzner. What Lakatos could teach the mathematical physicist, in G.
Kampis, L. Kvasz, M. Stöltzner (eds.). Appraising Lakatos. Mathematics, Method-
ology and the Man, Kluwer, Dordrecht, 2002, 157-188.

37. E. R. Swart. The philosophical implications of the four-colour problem, American 
Math. Monthly 87, 9 (1980), 697-702. 

38. G. J. Tee. Computers and mathematics. The NZ Math. Magazine 24, 3 (1987), 
3-9. 

39. T. Tymoczko. The four-colour problem and its philosophical significance, J. Phi- 
losophy 2,2 (1979), 57-83. 

40. S. Wolfram. A New Kind of Science, Wolfram Media, 2002. 

41. Experimental Mathematics: Statement of Philosophy,
www.expmath.org/expmath/philosophy.html.




Rational Relations as Rational Series 



Christian Choffrut 



LIAFA, UMR 7089, Université Paris 7
2 Pl. Jussieu, Paris Cedex 75251
France

cc@liafa.jussieu.fr



Abstract. A rational relation is a rational subset of the direct product
of two free monoids: $R \subseteq A^* \times B^*$. Consider $R$ as a function of $A^*$ into
the family of subsets of $B^*$ by posing for all $u \in A^*$, $R(u) = \{v \in B^* \mid
(u,v) \in R\}$. Assume $R(u)$ is a finite set for all $u \in A^*$. We study how
the cardinality of $R(u)$ behaves as the length of $u$ tends to infinity and
we show that there exists an infinite hierarchy of growth functions.

Keywords: free monoid, rational relation, rational series. 



1 Introduction 

It is an elementary result in mathematics that the n-th term of a sequence of reals
satisfying a linear recurrence equation

$$u_n = a_1 u_{n-1} + a_2 u_{n-2} + \cdots + a_k u_{n-k}$$

is asymptotically equivalent to a linear combination of expressions of the form
$P(n)\lambda^n$ where $\lambda$ is a root of the characteristic polynomial of the recurrence,
$k$ its multiplicity and $P(n)$ a polynomial of degree $k - 1$, cf. [4, Theorem 6.8] or
[6, Lemma II.9.7]. No less well known is the fact that the $u_n$'s are the coefficients
of a rational series in one variable, or equivalently of the infinite expansion of 
the quotient of two polynomials on the field of the reals. The natural extension 
to rational series in a finite number of non-commuting variables is completely 
solved in [7] where it is shown that when the growth function of the coefficients is 
subexponential, it is polynomial with positive integer exponent. For exponential 
growth, some indication can be found in [10]. 
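
As a small numerical illustration of the asymptotic statement above, the following sketch checks it on the Fibonacci recurrence $u_n = u_{n-1} + u_{n-2}$. The function name `linear_recurrence` and the choice of example are ours, not the paper's; here the dominant root is the golden ratio and the polynomial $P(n)$ is the constant $1/\sqrt{5}$.

```python
import math

def linear_recurrence(coeffs, initial, n_terms):
    """First n_terms of u_n = coeffs[0]*u_{n-1} + ... + coeffs[k-1]*u_{n-k}."""
    terms = list(initial)
    while len(terms) < n_terms:
        # coeffs[i] multiplies the term i+1 steps back
        terms.append(sum(c * terms[-1 - i] for i, c in enumerate(coeffs)))
    return terms[:n_terms]

# Fibonacci numbers: u_0 = 0, u_1 = 1, u_n = u_{n-1} + u_{n-2}.
fib = linear_recurrence([1, 1], [0, 1], 40)

# Dominant root of the characteristic polynomial x^2 - x - 1.
lam = (1 + math.sqrt(5)) / 2

# Ratio between u_39 and its asymptotic equivalent lam^39 / sqrt(5).
ratio = fib[39] / (lam ** 39 / math.sqrt(5))
print(ratio)
```

The printed ratio is very close to 1, which is what asymptotic equivalence predicts for a simple dominant root.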

This paper is concerned with a less classical extension. We still
consider rational series in non-commuting variables but the coefficients belong 
to the family of rational subsets of a free monoid and we study the asymptotic 
behaviour (in some precise way which is specified later) of the coefficients. 

2 Preliminaries 

We refer the reader to the textbooks [1,3,6] for all definitions which are not 
recalled here, such as the notions of semiring, finite automaton, and the like. 
2.1 Rational Series

We denote by $A$ a finite set of letters (the alphabet) and by $A^*$ the free monoid
it generates. An element $u$ of $A^*$ is called a word or a string. Its length $|u|$ is the
number of letters occurring in $u$. The empty word has length 0 and is denoted
by 1.

Given a semiring $K$, we denote by $K\langle\langle A^* \rangle\rangle$ the monoid algebra of $A^*$ over
the semiring $K$. We view its elements as formal sums (i.e., series) of terms of
the form $ku$ where $k \in K$ and $u \in A^*$. This algebra is provided with the usual
operations of sum, product and star restricted to the elements whose constant
term is zero. The family $\mathrm{Rat}_K(A^*)$ of rational series over the semiring $K$ is the
smallest family of series containing the series reduced to the constant 0 and
the terms $ka$ for all $a \in A$ and closed under the operations of sum, product
and restricted star. By Kleene’s Theorem we know that any rational series is
recognized by a finite automaton with multiplicities [6, 5]. Here, we are concerned
with the case when $K$ is the semiring of rational subsets of the free monoid $B^*$
for some finite alphabet $B$, denoted by $\mathrm{Rat}_{\mathbb{B}} B^*$ where $\mathbb{B}$ is the Boolean semiring
$\{0, 1\}$, or more succinctly $\mathrm{Rat}\, B^*$.

2.2 Rational Relations 

We recall that a relation $R \subseteq A^* \times B^*$ is rational if it is a rational subset of the
product monoid $A^* \times B^*$, i.e., if it belongs to the smallest family of subsets of
the monoid $A^* \times B^*$ containing the singletons and closed under the operations
of subset union, product ($X \cdot Y = \{x \cdot y \mid x \in X, y \in Y\}$ where the product is
meant componentwise) and star ($X^* = \bigcup_{n \ge 0} X^n$).

The connection with the rational series is stated in the following basic result, 
cf., [5, Theorem I 1.7.]. 

Proposition 1. A relation $R \subseteq A^* \times B^*$ is rational if and only if the series
$R(u) = \{v \in B^* \mid (u,v) \in R\}$ is rational over the semiring $\mathrm{Rat}\, B^*$.

This Proposition breaks the symmetry between the input alphabet $A$ and
the output alphabet $B$. The set $\{v \in B^* \mid (u,v) \in R\}$ is called the image of
$u \in A^*$. The domain of $R$, denoted by $\mathrm{Dom}(R)$, is the subset of strings $u \in A^*$
whose image is non-empty. Assuming all input strings have finite image, we say
that the rational relation $R$ has asymptotic growth function $g : \mathbb{N} \to \mathbb{N}$ if the
following two conditions are satisfied ($\|X\|$ denotes the cardinality of a set $X$):

1) $\|R(w)\| = O(g(|w|))$ holds for all $w \in \mathrm{Dom}(R)$;

2) for some infinite (length-)increasing sequence of words $(w_n)_{n \ge 0}$ we have
$\|R(w_n)\| = \Theta(g(|w_n|))$.

Observe that the hypothesis that all input words have finite image is not a
strong requirement, since it can be easily shown that the restriction of a rational
relation to the subset of input strings with finite image, i.e., the relation $R_{<\infty} =
\{(u,v) \in A^* \times B^* \mid (u,v) \in R \text{ and } \|R(u)\| < \infty\}$, is a rational subset of $A^* \times B^*$.
The problem was first studied by Schützenberger in [8].
2.3 Finite Transducer

Rational relations can be computed by a construct which is a natural extension 
of a finite automaton. 

A transducer is a quadruple $T = (Q, Q_-, Q_+, \mu)$ where $Q$ is the finite set
of states, $Q_- \subseteq Q$ (resp. $Q_+ \subseteq Q$) the set of initial (resp. final) states and
$\mu : A^* \to \mathrm{Rat}(B^*)^{Q \times Q}$ is a linear representation of the monoid $A^*$ into the
multiplicative monoid of square matrices in the semiring $\mathrm{Rat}(B^*)$. Observe that
the input and output alphabets $A$ and $B$ are understood in the definition of a
transducer.

A linear representation is traditionally pictured as a finite labeled graph. E.g.,
consider the representation $\mu : \{a,b\}^* \to \mathrm{Rat}(\{b\}^*)^{3 \times 3}$ defined by

$$\mu(a) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \mu(b) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$

Identify the rows and columns with the integers 1, 2, 3 and choose $Q_- = \{1\}$ and
$Q_+ = \{3\}$. It should be clear how to pass from the matrix representation to
the following graph representation (by abuse of notation, the sets consisting of
a unique word are identified with this word, i.e., we write $b$ instead of the more
rigorous $\{b\}$).



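[Figure: graph of the transducer on states 1, 2, 3, with loops labeled a, b/1 (state 1), a/b (state 2) and a, b/1 (state 3).]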




The relation computed by $T$, denoted $\|T\|$, is the relation $\{(u,v) \in A^* \times B^* \mid
\exists q_- \in Q_-, q_+ \in Q_+, v \in \mu_{q_-,q_+}(u)\}$. In the above example, the relation
computed by the transducer is $\{(u, b^n) \in A^* \times b^* \mid u \in A^* b a^n b A^*\}$.
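
The example can be checked mechanically by multiplying the matrices letter by letter. The sketch below is ours and rests on two assumptions: entries are encoded as Python `frozenset`s of strings (the empty set plays the role of 0, the set containing only the empty word plays the role of 1), and the first row of the matrix for the letter b is taken to be (1, 1, 0), which is what the stated relation requires. The names `mat_mul`, `MU` and `image` are hypothetical.

```python
def mat_mul(X, Y):
    """Product of square matrices whose entries are finite sets of words."""
    n = len(X)
    return [[frozenset(x + y
                       for k in range(n)
                       for x in X[i][k]
                       for y in Y[k][j])
             for j in range(n)]
            for i in range(n)]

ZERO = frozenset()         # the empty set, written 0 in the text
ONE = frozenset({""})      # the set reduced to the empty word, written 1
B = frozenset({"b"})       # the singleton {b}, written b in the text

# Assumed matrices of the linear representation (rows/columns = states 1, 2, 3).
MU = {
    "a": [[ONE, ZERO, ZERO], [ZERO, B, ZERO], [ZERO, ZERO, ONE]],
    "b": [[ONE, ONE, ZERO], [ZERO, ZERO, ONE], [ZERO, ZERO, ONE]],
}

def image(u, initial=(0,), final=(2,)):
    """Set of outputs of the input word u (states are 0-indexed here)."""
    M = [[ONE if i == j else ZERO for j in range(3)] for i in range(3)]
    for letter in u:
        M = mat_mul(M, MU[letter])
    return set().union(*(M[p][q] for p in initial for q in final))

print(image("babaab"))
```

For the input babaab the two factorizations b·a·b and b·aa·b give the outputs b and bb, so the image has two elements; an input with fewer than two occurrences of b has an empty image.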

Since we are concerned with rational relations for which every input has finite
image, all entries of the matrices $\mu(a)$ for $a \in A$ are finite subsets of $B^*$.

3 An Infinite Hierarchy

The rational relations with finite but unbounded image, i.e., for which there is
no integer $N$ such that $\|R(u)\| < N$ holds for all $u \in \mathrm{Dom}(R)$, are characterized in
[9] by two local conditions on the structure of the transducer. A relatively direct
consequence is that if the growth is subexponential, then it is bounded by $O(n^k)$
for some integer $k$. We shall prove in this note a more precise result by showing
that for each integer $k$, there exists a rational relation whose growth function is
in $\Theta(n^k)$.
In order to exhibit an infinite hierarchy, consider the alphabets $A = \{a_1, \ldots,
a_k\}$ and $B = \{b, c\}$. Set $W_i = A^* - a_i A^* - A^* a_{i+1}$ for $i = 1, \ldots, k-1$ with the
convention $W_0 = A^* - A^* a_1$ and $W_k = A^* - a_k A^*$. Define the relation

$$R(w) = \{\ldots\, c\, b^{n_k} \mid w \in W_0\, a_1^{n_1} W_1 \cdots a_k^{n_k} W_k\} \qquad (1)$$

for all $w \in (a_1^+ a_2^+ \cdots a_k^+)^*$ (where as usual, we set $X^+ = XX^*$). Observe that the
relation is indeed rational. We draw a transducer computing the relation when
$k = 2$ and let the reader guess why we do not draw it for values of $k$ greater
than 2.



[Figure: transducer computing the relation for k = 2; surviving edge labels: a_2/1, a_1/1, a_2/b.]




Proposition 2. We have $\|R(w)\| = O(|w|^{k/2})$ for all $w \in \mathrm{Dom}(R)$. Further-
more, for some infinite (length-)increasing sequence of words $(w_n)_{n \ge 0}$ we have
$\|R(w_n)\| = \Theta(|w_n|^{k/2})$.

Proof. Let us first prove the last claim. Consider the words $w_n = \prod_{1 \le i \le n} a_1^i a_2^i
\cdots a_k^i$. Then we have $R(w_n) = \{\ldots\, c\, b^{n_k} \mid 1 \le n_1 \le n_2 \le \cdots \le n_k \le n\}$.
A simple computation leads to $|w_n| = \Theta(n^2)$ and $\|R(w_n)\| = \Theta(n^k)$, which
completes the verification.
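
The simple computation can be spelled out as follows. This is a sketch under the assumption (ours, inferred from the surrounding text) that the elements of the image of $w_n$ are in bijection with the tuples $1 \le n_1 \le \cdots \le n_k \le n$; with strict inequalities the count becomes $\binom{n}{k}$, which has the same order of magnitude for fixed $k$.

```latex
\begin{align*}
|w_n| &= \sum_{i=1}^{n} k\,i = \frac{k\,n(n+1)}{2} = \Theta(n^2), \\
\|R(w_n)\| &= \binom{n+k-1}{k} = \Theta(n^k) = \Theta\bigl(|w_n|^{k/2}\bigr).
\end{align*}
```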

Let us now turn to the main claim and set

$$K = \limsup\, \{\, r \mid \ldots = O(1) \text{ for all } w \in (a_1^+ \cdots a_k^+)^* \,\}$$

The previous claim shows that $K \ge k/2$. In order to prove the equality let us
make a few observations. To that end, consider the standard decomposition of
an arbitrary word of the domain of $R$.



$$w = \prod_{1 \le j \le n} a_1^{r_{1j}} \cdots a_k^{r_{kj}} = a_1^{r_{11}} \cdots a_k^{r_{k1}} \cdots a_1^{r_{1n}} \cdots a_k^{r_{kn}}$$



Call spectrum of $w$ the function which assigns to each $1 \le i \le k$ the number
$\sigma_w(i)$ of different exponents $r_{ij}$, with $1 \le j \le n$. Let $N$ be the maximum of the
$\sigma_w(i)$'s when $i$ ranges from 1 to $k$. Then the number of elements in the image of $w$ is
in $O(N^k)$. It now suffices to give a lower bound for the length of $w$.

Observation 1: We may assume that $\{r_{ij} \mid 1 \le j \le n\} = \{1, \ldots, \sigma_w(i)\}$ holds for
$1 \le i \le k$.

Indeed, for a fixed $1 \le i \le k$, the bijection which to each exponent $r_{ij}$
associates its rank among the exponents of the letter $a_i$ does not increase the
length of the word and does not modify the cardinality of its image (e.g., the
sequence 4, 2, 5, 4, 7 of exponents would be normalized as 2, 1, 3, 2, 4).
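
The normalization used in Observation 1 amounts to replacing each exponent by its rank among the distinct exponents. A minimal sketch (the function name `normalize_exponents` is ours) reproduces the example from the text:

```python
def normalize_exponents(exponents):
    """Replace each exponent by its rank among the distinct exponents."""
    # Smallest distinct value gets rank 1, the next one rank 2, and so on.
    rank = {value: r for r, value in enumerate(sorted(set(exponents)), start=1)}
    return [rank[e] for e in exponents]

original = [4, 2, 5, 4, 7]
normalized = normalize_exponents(original)
print(normalized)
```

Since the exponents are positive integers, each rank is at most the value it replaces, so the total length cannot grow, and equal exponents stay equal, so the number of distinct exponents is unchanged.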

The second observation is obvious. 

Observation 2: We may assume that for each $1 \le i \le k$ there exist $n - \sigma_w(i) + 1$
occurrences of exponent 1 and one occurrence of each exponent $2, \ldots, \sigma_w(i)$. In
particular, if $i$ is the value of the index which achieves the maximum $\sigma_w(i)$ (equal
to $N$), the length $|w|_{a_i}$ in the letter $a_i$ is at least equal to $\frac{N(N+1)}{2}$.

As a consequence, by Observation 2 the length $|w|$ of a word $w$ whose spectrum
satisfies $\max\{\sigma_w(i) \mid 1 \le i \le k\} = N$ is not less than $\frac{N(N+1)}{2}$. Since the
cardinality of $R(w)$ under this hypothesis is in $O(N^k)$, the proof is completed.

□ 



Actually we may relax the condition on the number of generators of the free
monoid and establish the same result on the binary alphabet $A = \{a, b\}$. For
each integer $k$, consider the family $\mathcal{R}_k$ of rational relations with subexponential
growth

$$R \subseteq A^* \times A^* \text{ is in } \mathcal{R}_k \text{ if and only if } \|R(w)\| = O(|w|^{k/2}) \qquad (2)$$

Theorem 1. The hierarchy (2) is strict.

Proof. Consider the relation in (1) and define $E = \{(a b^i a, a_i) \mid i = 1, \ldots, k\}^*$.
Then the composition $E \circ R$ is a rational relation with growth function $\Theta(n^{k/2})$.

□ 



4 Further Developments 

There are plenty of possible variations of the problem of the asymptotic growth
of rational series. We may wish to study more general semirings for the coeffi-
cients (provided we can associate a numerical value, such as the cardinality, as in
this note) or study rational series over monoids more general than free monoids, or more
ambitiously, extend both the coefficient semiring and the monoid simultaneously.
For example, consider an $\mathbb{N}$-rational series over the direct product of two free
monoids $A^*$ and $B^*$. Given such a series $s$, denote by $s_{(u,v)}$ the coefficient asso-
ciated with the pair $(u,v) \in A^* \times B^*$: $s = \sum_{(u,v) \in A^* \times B^*} s_{(u,v)} (u,v)$. The growth
function of the coefficients of the series is the function $g : \mathbb{N} \to \mathbb{N}$ defined as
$g(n) = \max\{s_{(u,v)} \mid |u| + |v| = n\}$. The example worked out by Wich in order
to exhibit a logarithmic degree of ambiguity for linear context-free languages 
can be directly interpreted in terms of such series. Indeed, consider the following
three unambiguous rational series of the direct product $\{a,b\}^* \times \{a,b\}^*$.

$$G = \{(a^{i_1} b\, a^{i_2} b \cdots a^{i_n} b,\ a^{2i_1} b\, a^{2i_2} b) \mid n > 0,\ i_1, i_2, \ldots > 0\}$$

$$M = \{(a^n b, 1) \mid n > 0\}$$

$$R = \{(\ldots\, a^{i_n} b) \mid n > 0,\ i_1, i_2, \ldots, i_n > 0\}$$

By considering pairs of the form 

{aba^b . . .a^ b, a^ba^b . . .a^ b) 

for some integer $k$, it is shown that the product $GMR$ has logarithmic growth,
see [11] for details. 

More surprisingly, the simplest case of $\mathbb{N}$-rational series in $k$ commuting vari-
ables is not settled, at least as far as we know. The result in [2] proves in a
special case that the asymptotic growth can be of the form $n^{\lambda}$ for some real $\lambda$.



5 Open Problems 

Problem 1. Prove or disprove that every rational relation whose growth is
subexponential has an asymptotic growth function of the form $|w|^{k/2}$ for some
integer $k > 0$.

Problem 2. Does there exist an algorithm for computing the exponent of the 
asymptotic growth function of a subexponential rational relation? 
Abstract. Watson-Crick DOL systems (WDOL systems) are variants of
DOL systems with controlled derivations, inspired by the phenomenon
of Watson-Crick complementarity of the familiar double helix of DNA.
These systems are defined over a DNA-like alphabet, i.e. each letter has
a complementary letter and this relation is symmetric. Depending on a
special condition, called the trigger, a parallel rewriting step is applied
either to the string or to its complementary string. A network of Watson-
Crick DOL systems (an NWDOL system) is a finite set of WDOL systems
over a common DNA-like alphabet which act on their own strings in
parallel and after each derivation step send copies of some of the generated
words to the other nodes. In [2] it was shown that the so-called standard
NWDOL systems form a class of computationally complete devices, that
is, any recursively enumerable language can be determined by a network 
of standard Watson-Crick DOL systems. In this paper we prove that the 
computational power of these constructs does not change in the case of 
a certain type of incomplete information communication, namely where 
the communicated word is a non-empty prefix of the generated word. An 
analogous statement can be given for the case where the communicated 
word is a non-empty suffix of the string. 



1 Introduction 

Watson-Crick complementarity, motivated by the well-known characteristics of 
the familiar double helix of DNA, is a fundamental concept in DNA computing. 
According to this phenomenon, two DNA strands form a double strand if they
are complements of each other. A notion, called a Watson-Crick DOL system
(a WDOL system), where the paradigm of complementarity is considered in 
the operational sense, was introduced and proposed for further investigations 
in [8,9]. 
A Watson-Crick DOL system consists of a DOL system over a so-called DNA-like al-
phabet $\Sigma$ and a mapping $\phi$, called the trigger for complementarity transition.
In a DNA-like alphabet each letter has a complementary letter and this relation
is symmetric. The letters of a DNA-like alphabet are called purines and pyrim-
idines, a terminology extended from the DNA-alphabet of the four nucleotides
A, C, T, and G which are the sequence elements forming DNA strands. The
complementary letter of each purine is a pyrimidine and the complementary let-
ter of each pyrimidine is a purine. The trigger is a logical-valued mapping over
the set of strings over the DNA-like alphabet with the following property: the
$\phi$-value of the axiom is 0, and whenever the $\phi$-value of a string is 1, then the
$\phi$-value of its complementary word must be 0. (The complement of a string is
obtained by replacing each letter with its complementary letter.) The derivation
in the Watson-Crick DOL system is defined as follows: when the new string is
computed by applying the morphism of the DOL system, then it is checked ac-
cording to the trigger. If the $\phi$-value of the obtained string is 0 (the string is
a so-called good word), then the derivation continues in the usual manner. If
the obtained string is a so-called bad one, that is, its $\phi$-value is equal to 1, then
the string is changed for its complement and the derivation continues with this
complementary string.

The idea behind the concept is the following: in the course of a computa- 
tional or a developmental process things can go wrong to such extent that it is 
advisable to continue with the complementary string, which is always available 
[12]. Watson-Crick complementarity is viewed as an operation: together with or 
instead of a word we consider its complementary word. 

Particularly important variants of Watson-Crick DOL systems are the so-
called standard Watson-Crick DOL systems (SWDOL systems). The controlled
derivation in a standard Watson-Crick DOL system is defined as follows: after 
rewriting the string by applying rules of the DOL system in parallel, the number 
of occurrences of purines and that of pyrimidines in the obtained string are 
counted. If in the new string there are more occurrences of pyrimidines than 
that of purines, then each letter in the string is replaced by its complementary 
letter and the derivation continues from this string, otherwise the derivation 
continues in the usual manner. Thus, in this case the trigger is defined through 
the number of occurrences of purines and pyrimidines in the string. 

Watson-Crick DOL systems have been studied in detail over the years.
The interested reader can find further information on the computational power 
and different properties of these systems in [1, 16-18, 12-15,7]. 

Another research direction was initiated in [3] where networks of Watson- 
Crick DOL systems (NWDOL systems) were introduced and their behaviour
was studied. A network of Watson-Crick DOL systems (an NWDOL system) is a
finite set of WDOL systems over a common DNA-like alphabet which act on their
own strings in parallel and after each derivation step communicate copies of some
of the generated words to the other nodes. The condition for communication is 
determined by the trigger for turning to the complement. 
In [3] NWDOL systems with two main variants of protocols were studied:
in the first case (protocol (a)), after a parallel rewriting step the nodes keep 
the good strings and the corrected strings (complements of the bad strings) and 
communicate a copy of each good string they obtained to each other node. In 
the second case (protocol (b)), the nodes, again, keep both the good and the 
corrected strings but communicate the copies of the corrected strings. The two 
protocols realize different philosophies: in the first case the nodes inform each
other about their correct activities, in the second case they give information on 
the correction of their failures. 

The research was continued in [4], where three results were established about 
the power of so-called standard networks of Watson-Crick DOL systems (or 
NSWDOL systems). Two of them show how it is possible to solve in linear
time well-known NP-complete problems, namely, the Hamiltonian Path Prob- 
lem and the Satisfiability Problem. The third one shows how in the very simple 
case of four-letter DNA alphabets we can obtain weird (not even Z-rational) 
patterns of the population growth of the strings in the network. 

Network architectures are in the focus of interest in present computer science. 
One of the main areas of investigations is to study how powerful computational 
tools can be obtained by using networks of simple computing devices functioning 
with simple communication protocols. In [2] it was shown that any recursively 
enumerable language can be obtained as the language of an extended NSWDOL
system using protocol (a). The language of an extended NSWDOL system is the
set of words which are over a special sub-alphabet of the system (the terminal 
alphabet) and which appear at a dedicated node, the master node, at a derivation 
step during the functioning of the system. 

In this paper we deal with networks of standard WDOL systems with a cer-
tain type of incomplete information communication. We study the computational
power of NSWDOL systems where the node sends to each other node a good
non-empty prefix of every good non-empty word obtained by parallel rewriting 
(this can be the whole good word itself) and keeps the obtained good words and 
the complements of the bad words. A node is allowed to send different prefixes 
of the same string to different nodes, but it is allowed to send only one prefix of 
the string to a certain node. It also might happen that the same word is com- 
municated to a node as the chosen prefix of two different words and/or from two 
different nodes, but after the communication a communicated good string will 
be present at the destination node always only in one copy. 

We prove that in this case extended networks of standard Watson-Crick DOL 
systems form a class of computationally complete devices, i.e. any recursively 
enumerable language can be obtained by these constructs. An analogous state- 
ment can be given for the case where the communicated strings are good non- 
empty suffixes of the string. 
2 Preliminaries and Basic Notions

Throughout the paper we assume that the reader is familiar with the basic 
notions of formal language theory. For further details and unexplained notions 
consult [6], [10], and [11]. 

The set of non-empty words over an alphabet $\Sigma$ is denoted by $\Sigma^+$; if the
empty string, $\lambda$, is included, then we use the notation $\Sigma^*$. A set of strings $L \subseteq \Sigma^*$ is
said to be a language over the alphabet $\Sigma$. For a string $w \in L$ and for a set $U \subseteq \Sigma$,
we denote by $|w|_U$ the number of occurrences of letters of $U$ in $w$.

A string $u$ is said to be a prefix (a suffix) of a string $w \in \Sigma^*$ if $w = uz$
($w = zu$) holds for $u, z \in \Sigma^*$; $u$ is called a proper prefix (a proper suffix) if
$u \ne w$ and $u \ne \lambda$ holds. In the sequel, we shall denote by $\mathit{pref}(w)$ and $\mathit{suf}(w)$
the set of prefixes and the set of suffixes of a string $w$, respectively.

Now we recall the basic notions concerning standard Watson-Crick DOL sys- 
tems [8]. 

By a DNA-like alphabet we mean an alphabet $\Sigma$ with $2n$ letters, $n \ge 1$,
where $\Sigma$ is of the form $\Sigma = \{a_1, \ldots, a_n, \bar{a}_1, \ldots, \bar{a}_n\}$. Letters $a_i$ and $\bar{a}_i$, $1 \le i \le n$,
are said to be complementary letters. $\Sigma_1 = \{a_1, \ldots, a_n\}$ is said to be the sub-
alphabet of purines of $\Sigma$ and $\Sigma_2 = \{\bar{a}_1, \ldots, \bar{a}_n\}$ is called the sub-alphabet of
pyrimidines.

A string $w \in \Sigma^*$ is said to be good (or correct) if $|w|_{\Sigma_1} \ge |w|_{\Sigma_2}$ holds,
otherwise the string is called bad (or not correct). The empty word is a good
word.

We denote by $h_w$ the letter-to-letter endomorphism of a DNA-like alphabet
$\Sigma$ mapping each letter to its complementary letter.

A standard Watson-Crick DOL system (an SWDOL system, for short) is a
triple $H = (\Sigma, P, w_0)$, where $\Sigma$ is a DNA-like alphabet, the alphabet of the
system, $P$ is a set of pure context-free rules over $\Sigma$, the set of rewriting rules of
the system, and $w_0$ is a non-empty good (correct) word over $\Sigma$, the axiom of $H$.
Furthermore, $P$ is complete and deterministic, that is, $P$ has for each letter $b$ in
$\Sigma$ exactly one rule of the form $b \to u$, with $u \in \Sigma^*$.

The direct derivation step in $H$ is defined as follows: for two strings $x, y \in \Sigma^*$
we say that $x$ directly derives $y$ in $H$, denoted by $x \Rightarrow_H y$, if $x = x_1 \ldots x_m$,
$y = z_1 \ldots z_m$, $m \ge 1$, and $z_i = y_i$ if $y_1 \ldots y_m$ is a good word and $z_i = h_w(y_i)$
otherwise, where $x_i \to y_i \in P$, $1 \le i \le m$. The empty word, $\lambda$, directly derives
itself. The parallel rewriting of each $x_i$ onto $y_i$, $1 \le i \le m$, is denoted by
$x_1 \ldots x_m \Rightarrow_P y_1 \ldots y_m$.

Thus, if after applying a parallel rewriting to the string the obtained new
string has fewer occurrences of purines than of pyrimidines, then the new
string must turn to its complement and the derivation continues from this com-
plementary word, otherwise the derivation continues in the usual manner.
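
The derivation step just described can be sketched in a few lines. The encoding is an assumption of ours, not the paper's: upper-case letters stand for purines, lower-case letters for pyrimidines, and the complement of a letter is the same letter in the other case. The rule set `rules` and the function names are hypothetical.

```python
def complement(word):
    """Letter-to-letter complement under the assumed upper/lower-case encoding."""
    return word.swapcase()

def is_good(word):
    """A word is good if it has at least as many purines as pyrimidines."""
    purines = sum(1 for ch in word if ch.isupper())
    pyrimidines = sum(1 for ch in word if ch.islower())
    return purines >= pyrimidines

def derive(word, rules):
    """One direct derivation step of a standard Watson-Crick DOL system."""
    # Parallel rewriting: every letter is replaced by the right-hand side of its rule.
    rewritten = "".join(rules[ch] for ch in word)
    # Trigger: a bad word is replaced by its complement.
    return rewritten if is_good(rewritten) else complement(rewritten)

# Hypothetical complete deterministic rule set over the alphabet {A, a}.
rules = {"A": "Aa", "a": "a"}

word = "A"
for _ in range(3):
    word = derive(word, rules)
    print(word)
```

Starting from the axiom A, the first step yields the good word Aa; the second step rewrites it to Aaa, which is bad and is therefore replaced by its complement aAA.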

Now we recall the basic notions concerning networks of standard Watson- 
Crick DOL systems [3,2]. 

By a network of standard Watson-Crick DOL systems (an NSWDOL system,
for short) with $m$ components, where $m \ge 1$, we mean an $(m+1)$-tuple

$$\Gamma = (\Sigma, (P_1, w_1), \ldots, (P_m, w_m)),$$

where

— $\Sigma$ is a DNA-like alphabet, the alphabet of the system,

— $P_i$ is a complete deterministic set of pure context-free rules over $\Sigma$, the set
of rules of the $i$-th component (or the $i$-th node) of $\Gamma$, $1 \le i \le m$, and

— $w_i$ is a good (correct) non-empty word over $\Sigma$, the axiom of the $i$-th com-
ponent, $1 \le i \le m$.

The first component, $(P_1, w_1)$, is said to be the master node. (We note that,
for our convenience, any other node can be distinguished as the master node;
this does not change the meaning of the definition.)

NSWD0L systems function by changing their states according to parallel
derivation steps performed in the WD0L manner and a communication protocol.

By a state of an NSWD0L system $\Gamma = (\Sigma, (P_1, w_1), \ldots, (P_m, w_m))$,
$m \geq 1$, we mean an $m$-tuple $(L_1, \ldots, L_m)$, where $L_i$ is a set of good
words over $\Sigma$, $1 \leq i \leq m$.

The initial state of the system is $(\{w_1\}, \ldots, \{w_m\})$.

Modifying the notion of protocol (a), introduced in [3], we define the
functioning of an NSWD0L system which uses communication protocol $(x, a)$, where
$x \in \{pref, suf\}$. In this case, after the parallel rewriting step, the node sends a
good non-empty prefix (a good non-empty suffix) of every obtained good non- 
empty string to each other node and keeps the obtained good words and the 
complements of the generated bad words. Notice that the communicated word 
can be the good word itself. Observe that a node is allowed to send different 
prefixes of the same string to different nodes, but it is allowed to send only one 
prefix of the string to a certain node. It also might happen that the same word 
is communicated to a node as the chosen prefix of two different words and/or 
from two different nodes, but after communication a communicated good string 
will be present at the destination node always only in one copy. 
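
As a small illustration of the protocol, the following Python sketch lists the words a node is allowed to choose from when it communicates a good word, and shows that a destination keeps a received word in one copy only. It uses the same assumed encoding as before (lowercase for purines, uppercase for pyrimidines); the function names are hypothetical.

```python
# Sketch: candidate words a node may send under protocol (pref, a) or (suf, a).
# Encoding assumption (not from the text): lowercase letters are purines,
# uppercase letters are the complementary pyrimidines.

def is_good(word: str) -> bool:
    """Good means pyrimidines do not strictly outnumber purines."""
    return sum(c.islower() for c in word) >= sum(c.isupper() for c in word)

def good_prefixes(word: str) -> list[str]:
    """All good non-empty prefixes of a word, shortest first."""
    return [word[:k] for k in range(1, len(word) + 1) if is_good(word[:k])]

def good_suffixes(word: str) -> list[str]:
    """All good non-empty suffixes of a word, shortest first."""
    return [word[-k:] for k in range(1, len(word) + 1) if is_good(word[-k:])]

def deliver(destination: set[str], chosen: str) -> None:
    """The destination stores a received word in one copy only (set semantics)."""
    destination.add(chosen)

word = "Aaa"                 # good: two purines, one pyrimidine
print(good_prefixes(word))   # ['Aa', 'Aaa']
print(good_suffixes(word))   # ['a', 'aa', 'Aaa']

inbox: set[str] = set()
deliver(inbox, "Aa")
deliver(inbox, "Aa")         # the same word arriving a second time
print(len(inbox))            # 1
```

Since the whole word is always among the candidates for a good word, complete communication is a special case of this protocol.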

Let $\Gamma = (\Sigma, (P_1, w_1), \ldots, (P_m, w_m))$, $m \geq 1$, be an NSWD0L system
and let $s_1 = (L_1, \ldots, L_m)$ and $s_2 = (L'_1, \ldots, L'_m)$ be two states of $\Gamma$.

We say that $s_1$ directly derives $s_2$ according to protocol $(x, a)$, where
$x \in \{pref, suf\}$, written as $s_1 \Longrightarrow_\Gamma^{(x,a)} s_2$, if the following
condition holds: for each $i$, $1 \leq i \leq m$,

$$L'_i = A'_i \cup B'_i \cup \bigcup_{j=1,\, j \neq i}^{m} C'_j,$$

where

$$A'_i = \{ z \mid z = h_w(y),\ x \Rightarrow_{P_i} y,\ x \in L_i,\ y \text{ is a bad string} \},$$

$$B'_i = \{ y \mid x \Rightarrow_{P_i} y,\ x \in L_i,\ y \text{ is a good string} \},$$

and $C'_j$ is a set of elements obtained from the elements of

$$B'_j = \{v_{j,1}, \ldots, v_{j,r_j}\} = \{ y \mid x \Rightarrow_{P_j} y,\ x \in L_j,\ y \text{ is a good string} \}$$

as follows:

$$C'_j = \{v'_{j,1}, \ldots, v'_{j,r_j}\},$$

where $v'_{j,k} \in x(v_{j,k})$ for $x \in \{pref, suf\}$, $v'_{j,k}$ is a non-empty good word and
$1 \leq k \leq r_j$. If $B'_j$ is the empty set, then $C'_j$ is empty as well.

The transitive and reflexive closure of $\Longrightarrow_\Gamma^{(x,a)}$ is denoted by
$\Longrightarrow_\Gamma^{(x,a)*}$.

The language of an NSWD0L system $\Gamma = (\Sigma, (P_1, w_1), \ldots, (P_m, w_m))$,
$m \geq 1$, using protocol $(x, a)$ for $x \in \{pref, suf\}$ is

$$L_{(x,a)}(\Gamma) = \{ u_1 \in L_1 \mid (\{w_1\}, \ldots, \{w_m\}) \Longrightarrow_\Gamma^{(x,a)*} (L_1, \ldots, L_m) \}.$$

That is, the language of $\Gamma$ is the set of strings which appear at the master
node at some derivation step of the functioning of the system, including the
axiom.

By an extended NSWD0L system (an ENSWD0L system, for short) we
mean an $(m+2)$-tuple $\Gamma = (\Sigma, T, (P_1, w_1), \ldots, (P_m, w_m))$, $m \geq 1$,
where $T \subseteq \Sigma$ and all other components of $\Gamma$ are defined in the same way
as in the case of NSWD0L systems.

The language of an extended NSWD0L system $\Gamma$ using protocol $(x, a)$ for
$x \in \{pref, suf\}$ is defined by

$$L_{(x,a)}(\Gamma) = \{ u_1 \in (T^* \cap L_1) \mid (\{w_1\}, \ldots, \{w_m\}) \Longrightarrow_\Gamma^{(x,a)*} (L_1, \ldots, L_m) \}.$$

3 Computational Power of ENSWD0L Systems

In the following we show that any recursively enumerable language can be
obtained as the language of an extended NSWD0L system using communication
protocol $(pref, a)$. Since the language of any extended NSWD0L system is a
recursively enumerable language, the statement implies that ENSWD0L systems
are as powerful as Turing machines. An analogous statement can be given for
the case of protocol $(suf, a)$, by modifying the proof of the above statement.
The idea of the proof is to simulate the generation of the words of the
recursively enumerable language of an Extended Post Correspondence (EPC) by an
ENSWD0L system.

Let $T = \{a_1, \ldots, a_n\}$ be an alphabet, where $n \geq 1$. An Extended Post
Correspondence (an EPC, for short) is a pair
$P = (\{(u_1, v_1), \ldots, (u_r, v_r)\}, (z_{a_1}, \ldots, z_{a_n}))$, where
$u_j, v_j, z_{a_i} \in \{0, 1\}^*$, $1 \leq j \leq r$, $1 \leq i \leq n$.

The language represented by $P$ in $T$, written as $L(P)$, is

$$L(P) = \{ x_1 \ldots x_m \in T^* \mid \text{there are indices } s_1, \ldots, s_t \in \{1, \ldots, r\},\ t \geq 1, \text{ such that } u_{s_1} \ldots u_{s_t} = v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m} \}.$$

It is known that for each recursively enumerable language $L$ there exists an
Extended Post Correspondence $P$ such that $L = L(P)$ [5].

We note that the above definition remains correct and the statement remains
true if we suppose that the words $u_j, v_j, z_{a_i}$ are given over $\{1, 2\}$ instead of
$\{0, 1\}$. We shall use this observation to make the construction simpler, so in the
sequel we shall consider this version of the EPC and the above statement. Thus, we
can consider the words $u_{s_1} \ldots u_{s_t}$ and $v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$ as
numbers in the base three notation and therefore we can speak about their values.

According to the above theorem, a word $w = x_1 \ldots x_m$, $x_i \in T$, $1 \leq i \leq m$,
is in $L$ if and only if there exist indices $s_1, \ldots, s_t \in \{1, \ldots, r\}$ such that
the two words $u_{s_1} \ldots u_{s_t}$ and $v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$ have the
same value as numbers in the base three notation.
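
To make the membership condition concrete, here is a minimal Python sketch that searches for a witnessing index sequence up to a chosen length. The function name, the bound `max_t` and the sample EPC are assumptions made for illustration; since membership is undecidable in general, a negative answer only means that no witness was found within the bound.

```python
from itertools import product

def in_epc_language(pairs, z, word, max_t):
    """Bounded search for indices s_1..s_t (t <= max_t) with
    u_{s_1}...u_{s_t} == v_{s_1}...v_{s_t} z_{x_1}...z_{x_m}."""
    z_part = "".join(z[x] for x in word)
    for t in range(1, max_t + 1):
        for seq in product(range(len(pairs)), repeat=t):
            left = "".join(pairs[s][0] for s in seq)
            right = "".join(pairs[s][1] for s in seq) + z_part
            if left == right:
                return True
    return False  # no witness of length <= max_t; says nothing beyond the bound

# Hypothetical EPC over {1, 2} with one pair (u_1, v_1) and T = {a}.
pairs = [("121", "1")]
z = {"a": "21"}
print(in_epc_language(pairs, z, "a", 3))   # True: 121 == 1 + 21
print(in_epc_language(pairs, z, "aa", 3))  # False within the bound
```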

It is easy to see that we can determine the words of $L$ as follows: We start
the generation with a string of the form $u_{s_1} v_{s_1}$, $s_1 \in \{1, \ldots, r\}$. Then we
add $u$-s and $v$-s to the string in the correct manner to obtain a string of the form
$\alpha\beta$ with $\alpha = u_{s_1} \ldots u_{s_t}$ and $\beta = v_{s_1} \ldots v_{s_t}$, for $t \geq 1$.
Then, in the second phase of the generation we add $x$-s and $z$-s to the string in a
correct manner to obtain $x_1 \ldots x_m u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$.
In the final phase we check whether $\alpha = u_{s_1} \ldots u_{s_t}$ and
$\beta' = v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$ are equal or not, and if they are equal,
then we eliminate both substrings from the string. If the empty word is in $L$, then
after the first phase of the above procedure, we continue with the final generation
phase. The reader can observe that the words of $L$ can also be obtained if in
the previous procedure we represent $\alpha$, $\beta$, and $\beta'$ with strings with exactly as
many occurrences of a certain letter, say, $A$ and $B$, respectively, as the value of
$\alpha$, $\beta$, and $\beta'$, respectively, according to the base three notation. Thus, we can
simulate the appending of a pair $(u_j, v_j)$ or $(a_i, z_{a_i})$ to the string in generation
by modifying the number of occurrences of letters $A$ and $B$ in the word. This
observation will be used in our construction.

We shall use the following notation in the sequel: for a word $u \in \{1, 2\}^*$, we
denote by $val(u)$ the value of $u$ as a number in the base three notation and by
$dig(u)$ the length of $u$ (the number of digits in $u$).
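
The two functions, and the way the value changes when a word is appended, can be checked with a short Python sketch. The helper `val_after_append` is a name introduced here for illustration; it encodes the identity $val(uv) = val(u) \cdot 3^{dig(v)} + val(v)$, which is the arithmetic that updating a letter count has to reproduce.

```python
def val(u: str) -> int:
    """Value of a word over {1, 2} read as a number in base three."""
    value = 0
    for digit in u:
        if digit not in "12":
            raise ValueError("words are expected over the digits 1 and 2")
        value = 3 * value + int(digit)
    return value

def dig(u: str) -> int:
    """Number of digits of u."""
    return len(u)

def val_after_append(value_of_u: int, v: str) -> int:
    """Value of uv, computed from val(u) and the appended word v."""
    return value_of_u * 3 ** dig(v) + val(v)

print(val("12"))                        # 5
print(val("122"))                       # 17
print(val_after_append(val("12"), "2")) # 17, the same as val("122")
```

Because the digit 0 never occurs, two different words over $\{1, 2\}$ always have different values, so comparing values is the same as comparing words.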

Theorem 1. For every recursively enumerable language $L$ there exists an
ENSWD0L system $\Gamma$ such that $L_{(pref,a)}(\Gamma) = L$.

Proof. Let $L$ be a recursively enumerable language with $L \subseteq T^*$, where
$T = \{a_1, \ldots, a_n\}$, $n \geq 1$, and let $L$ be represented by an EPC

$$P = (\{(u_1, v_1), \ldots, (u_r, v_r)\}, (z_{a_1}, \ldots, z_{a_n})),$$

where $u_j, v_j, z_{a_i} \in \{1, 2\}^*$, $1 \leq j \leq r$, $1 \leq i \leq n$. We construct an
ENSWD0L system $\Gamma$ such that $L_{(pref,a)}(\Gamma) = L(P)$ and $\Gamma$, functioning with
protocol $(pref, a)$, simulates the generation of words of $L$ according to $P$.

For each pair $(u_j, v_j)$, $1 \leq j \leq r$, and for each pair $(a_i, z_{a_i})$,
$1 \leq i \leq n$, $\Gamma$ will have a dedicated node which simulates the effect of
appending the pair to the string in generation in a correct manner. Furthermore,
$\Gamma$ will also have a node dedicated for deciding whether or not the two substrings
representing the auxiliary substrings $\alpha$ and $\beta'$ (see the short explanation before
the theorem) are equal. The nodes of $\Gamma$ will also be able to check whether or not
a string of a certain form which arrives at the node from another node is a good
proper prefix of the word that served as the source of the communication at the
other node. If a string of this form is a good proper prefix of the original string to
be communicated, then in the course of the further derivation steps this string will
have an occurrence of the trap symbol at the first position. Neither the trap symbol,
$F$, nor its complementary symbol, $\bar{F}$, can be cancelled from a string at any
node. Thus, neither the string with the trap symbol, nor any word originating
from this string (by rewriting at any node or by communication to any node)
can take part in a derivation of a terminal word. For the sake of easier reading,
we also use the short term "the node for the pair $(u, v)$ or $(a, z_a)$" in the sequel
instead of the long version "the node dedicated for simulating the effect of adding
the pair $(u, v)$ or $(a, z_a)$ to the string in generation."

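Now we define $\Gamma$. For the sake of legibility, we provide the reader only with
the necessary details.

Let

$$\Gamma = (\Sigma, T, (P_e, w_e), (P_{(u_1,v_1)}, w_{(u_1,v_1)}), \ldots, (P_{(u_r,v_r)}, w_{(u_r,v_r)}), (P_{(a_1,z_{a_1})}, w_{(a_1,z_{a_1})}), \ldots, (P_{(a_n,z_{a_n})}, w_{(a_n,z_{a_n})})),$$

where $n$ and $r$ are given by EPC $P$.

Let

$$\begin{array}{rl}
\Sigma = & \{X, \bar{X} \mid X \in \{A_p, B_p, \$_{p,A}, \$_{p,B}, \$_{0,p}, \$_{1,p}\},\ 1 \leq p \leq 3\} \ \cup \\
 & \{X, \bar{X} \mid X \in \{Y_j, A_{1,j}, B_{1,j}, \$_{1,A,j}, \$_{1,B,j}, \$_{0,1,j}, \$_{1,1,j}\},\ 1 \leq j \leq r\} \ \cup \\
 & \{X, \bar{X} \mid X \in \{Z_i, A_{2,i}, B_{2,i}, \$_{2,A,i}, \$_{2,B,i}, \$_{0,2,i}, \$_{1,2,i}\},\ 1 \leq i \leq n\} \ \cup \\
 & \{X_i, \bar{X}_i \mid X \in \{a, b, c, d, f\},\ 1 \leq i \leq n\} \ \cup \\
 & \{X, \bar{X} \mid X \in \{A_4, B_4, A_5, B_5, Z, E, F\}\}.
\end{array}$$

We note that $F$ is the so-called trap symbol, and each node contains a rule
with left-hand side $F$ and a rule with left-hand side $\bar{F}$, both of which
reintroduce the trap symbol. The axioms are defined as follows: $w_{(u_j,v_j)} = E$, for
$1 \leq j \leq r$, $w_{(a_i,z_{a_i})} = E$, for $1 \leq i \leq n$, and $w_e = E$. The master node is
$(P_e, w_e)$. In the following we define the rule sets of the nodes, with some
explanations concerning their functioning.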

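The rule set $P_{(u_j,v_j)}$ of the node dedicated for simulating the effect of
appending the pair $(u_j, v_j)$, $1 \leq j \leq r$, to the string consists of the following
rules:

$$\begin{array}{lll}
A_1 \to \bar{A}_{1,j}A_{1,j}, & B_1 \to \bar{B}_{1,j}B_{1,j}, & \$_{1,A} \to \bar{\$}_{1,A,j}\$_{1,A,j}, \\
\$_{1,B} \to \$_{1,B,j}, & \$_{0,1} \to \bar{Y}_j\bar{\$}_{0,1,j}\$_{0,1,j}, & \$_{1,1} \to \bar{\$}_{1,1,j}\$_{1,1,j}, \\
A_{1,j} \to A_1^{3^{dig(u_j)}}, & B_{1,j} \to B_1^{3^{dig(v_j)}}, & \$_{1,A,j} \to A_1^{k_j}\$_{1,A}, \\
\$_{1,B,j} \to B_1^{l_j}\$_{1,B}, & \$_{0,1,j} \to \$_{0,1}, & \$_{1,1,j} \to \$_{1,1}.
\end{array}$$

$P_{(u_j,v_j)}$ also contains $Y_j \to F$, and $X \to \lambda$ for
$X \in \{\bar{Y}_j, \bar{A}_{1,j}, \bar{B}_{1,j}, \bar{\$}_{1,A,j}, \bar{\$}_{0,1,j}, \bar{\$}_{1,1,j}\}$,
and $X \to F$ for any other letter $X$ of $\Sigma$ different from $E$ and the letters
with the above listed rules. The node also has the rule

$$E \to \$_{0,1}\$_{1,1}A_1^{k_j}\$_{1,A}B_1^{l_j}\$_{1,B},$$

where $k_j$ is equal to the value of $u_j$, and $l_j$ is equal to the value of $v_j$.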

We give some explanations of the functioning of this node. The rules of the
node are constructed in such a way that a string which appears at the node during
the functioning of the system can lead to a terminal word of $\Gamma$ only if it represents
$\alpha\beta$ in the first phase of the generation of a word in $L(P)$ according to EPC $P$.
The strings which are of different forms either already have an occurrence of the
trap symbol at the first position or will obtain it in the course of the following
derivation (rewriting steps and communication). Then neither this string with
$F$ nor any other string which originates from it can lead to a terminal word in $\Gamma$.

Suppose that a string found at this node is a non-empty prefix $v'$ of a string
of the form

$$v = \$_{0,1}\$_{1,1}A_1^{k_1}\$_{1,A}B_1^{l_1}\$_{1,B},$$

where $k_1$ is equal to the value of $u_{s_1} \ldots u_{s_t}$ and $l_1$ is equal to the value of
$v_{s_1} \ldots v_{s_t}$, $s_1, \ldots, s_t \in \{1, \ldots, r\}$, for some $t \geq 1$. Then, by applying
the rules of the node, the obtained string will be the corresponding prefix $v'''$ of
the string

$$v'' = \bar{Y}_j\bar{\$}_{0,1,j}\$_{0,1,j}\bar{\$}_{1,1,j}\$_{1,1,j}(\bar{A}_{1,j}A_{1,j})^{k_1}\bar{\$}_{1,A,j}\$_{1,A,j}(\bar{B}_{1,j}B_{1,j})^{l_1}\$_{1,B,j}.$$

This string, $v'''$, is a good string if and only if $v'$ contains $\$_{1,B}$, that is, either
$v$ was communicated from another node, or $v$ was obtained at this node by the
previous parallel rewriting step. Otherwise, $v'''$ turns to its complement and in
the next derivation step the new string will obtain an occurrence of the trap
symbol at the first position. (Observe that $v''$ represents a string of the form
$\alpha\beta$, see the explanation before the theorem.) Then neither this new string with
$F$, nor any other string which originates from this word can take part in the
derivation of a terminal word of $\Gamma$ in the further steps of the derivation. Similarly,
if $v''$ is a good string and it is communicated to another node, then it will change
to a string of trap symbols at the destination node. (Notice that in this case $v''$ has
only one good non-empty prefix, namely itself.) The same holds for the strings of
the same form as $v''$ which arrive from another node $(P_{(u_h,v_h)}, w_{(u_h,v_h)})$, with
$h \neq j$, $1 \leq h \leq r$. We shall see below that strings arriving from node $(P_e, w_e)$
or from a node $(P_{(a_i,z_{a_i})}, w_{(a_i,z_{a_i})})$, for $1 \leq i \leq n$, either already have an
occurrence of $F$ at the first position or will be rewritten onto a string with the trap
symbol at the first position in the next derivation step. Thus, neither they, nor any
word originating from these strings can take part in the generation of a terminal
word of $\Gamma$ in the course of the further derivation steps. Thus, suppose that
$v''' = v''$. Then, by applying the rules of the node to $v''$, we obtain the string

$$\$_{0,1}\$_{1,1}A_1^{k_2}\$_{1,A}B_1^{l_2}\$_{1,B},$$

where $k_2 = k_1 \cdot 3^{dig(u_j)} + val(u_j)$ and $l_2 = l_1 \cdot 3^{dig(v_j)} + val(v_j)$. Thus,
the rewriting simulates the effect of appending the pair $(u_j, v_j)$ to the string
$u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t}$ in the correct manner, to represent
$u_{s_1} \ldots u_{s_t} u_j v_{s_1} \ldots v_{s_t} v_j$.

The rule set $P_{(a_i,z_{a_i})}$ of the node dedicated for simulating the effect of
appending the pair $(a_i, z_{a_i})$, $1 \leq i \leq n$, to the string contains the following
rules, for $p \in \{1, 2\}$:

$$\begin{array}{lll}
A_p \to \bar{A}_{2,i}A_{2,i}, & B_p \to \bar{B}_{2,i}B_{2,i}, & \$_{p,A} \to \bar{\$}_{2,A,i}\$_{2,A,i}, \\
\$_{p,B} \to \$_{2,B,i}, & \$_{0,p} \to \bar{Z}_i\bar{\$}_{0,2,i}\$_{0,2,i}, & \$_{1,p} \to \bar{\$}_{1,2,i}\$_{1,2,i}, \\
A_{2,i} \to A_2, & B_{2,i} \to B_2^{3^{dig(z_{a_i})}}, & \$_{2,A,i} \to \$_{2,A}, \\
\$_{2,B,i} \to B_2^{val(z_{a_i})}\$_{2,B}, & \$_{0,2,i} \to \$_{0,2}, & \$_{1,2,i} \to b_i\$_{1,2}.
\end{array}$$

Moreover, it also contains productions $b_h \to \bar{b}_{h,i}b_{h,i}$, $b_{h,i} \to b_h$,
$\bar{b}_{h,i} \to \lambda$ for $1 \leq h \leq n$, and $Z_i \to F$, $X \to \lambda$ for
$X \in \{\bar{Z}_i, \bar{A}_{2,i}, \bar{B}_{2,i}, \bar{\$}_{2,A,i}, \bar{\$}_{0,2,i}, \bar{\$}_{1,2,i}\}$ and
$X \to F$ for any other letter $X$ of $\Sigma$ different from the letters with productions
listed above. Letter $b_i$ represents $a_i \in T$, $1 \leq i \leq n$. Let
$T_b = \{b_i \mid 1 \leq i \leq n\}$.

Again, we give some explanations of the functioning of this node. Analogously
to the previous case, the productions of this node are constructed in such a way
that a string which appears at this node can lead to a terminal word in $\Gamma$ only
if it represents either $\alpha\beta$ or $u\alpha\beta'$ in the first, respectively in the second phase
of the generation of a word in $L(P)$. The strings which are of different forms
either already have an occurrence of the trap symbol at the first position or will
obtain it in the course of the following derivation (rewriting and communication)
and neither the string with $F$ nor any other string originating from it can lead
to a terminal word in $\Gamma$. Suppose that a string found at this node is a good
non-empty prefix $v'$ of a string $v$ of the form

$$v = \$_{0,p}u'\$_{1,p}A_p^{k_1}\$_{p,A}B_p^{l_1}\$_{p,B},$$

where $u' \in T_b^*$, $p \in \{1, 2\}$, and $u' = \lambda$ for $p = 1$. String $u'$ is obtained from
$u \in T^*$ by replacing any occurrence of $a_i$ in $u$ with $b_i$, for $1 \leq i \leq n$.

Furthermore, $k_1$ is equal to the value of $u_{s_1} \ldots u_{s_t}$ and $l_1$ is equal to the value
of $v_{s_1} \ldots v_{s_t} z_u$, where $s_1, \ldots, s_t \in \{1, \ldots, r\}$, $t \geq 1$, and $z_u$ is the
sequence of $z$-s corresponding to $u$ for $u \neq \lambda$ and $z_u = \lambda$ for $u = \lambda$. Then,
similarly to the case of the nodes $P_{(u_j,v_j)}$, $1 \leq j \leq r$, we can show that in two
derivation steps we obtain from $v'$ either the string of the form

$$\$_{0,2}u'b_i\$_{1,2}A_2^{k_2}\$_{2,A}B_2^{l_2}\$_{2,B},$$

where $k_2 = k_1$ and $l_2 = l_1 \cdot 3^{dig(z_{a_i})} + val(z_{a_i})$, or a string is obtained
which has an occurrence of $F$ at the first position. Then neither this latter string,
nor any string originating from this one (by rewriting or communication) can lead
to a terminal word of $\Gamma$. Indeed, if $v'$ is a proper prefix of $v$, then $v'$ will derive in
two derivation steps a string with the trap symbol at the first position, since if
$\$_{p,B}$ and thus $\$_{2,B,i}$ does not occur in the string, symbol $\bar{Z}_i$ turns to its
complement and then $Z_i$ is rewritten to $F$. Thus, starting from $v$, we can simulate
the effect







of appending the pair $(a_i, z_{a_i})$ to the string
$u u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t} z_u$, obtaining a string which represents
$u a_i u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t} z_u z_{a_i}$.

As in the case of node $(P_{(u_j,v_j)}, w_{(u_j,v_j)})$, a string which arrives from another
node and is not of the form $v$, above, either already has an occurrence of the
trap symbol at the first position or will obtain it in the course of the following
derivation and then neither the string nor any string originating from this string
will take part in a derivation in $\Gamma$ which leads to a terminal word.

Finally, we list the rules in the rule set $P_e$ of the node dedicated for
deciding whether the generated string satisfies EPC $P$ or not, that is, whether the
corresponding two strings, $\alpha$ and $\beta'$, mentioned in the explanation before the
theorem, are equal or not. This is done by using the possibility of turning to the
complement. To help the reader in understanding how the decision is done, we
list the rules together with a derivation.

We note that analogously to the case of the other nodes, the rules of $P_e$ are
defined in such a way that only those strings appearing at this node lead to a
terminal word which represent strings of the form
$x_1 \ldots x_m u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$, with $x_i \in T$,
$1 \leq i \leq m$, where $\alpha = \beta'$ for $\alpha = u_{s_1} \ldots u_{s_t}$ and
$\beta' = v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$. (See the explanation before the theorem.)

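Let

$$v = \$_{0,p}u'\$_{1,p}A_p^{k}\$_{p,A}B_p^{l}\$_{p,B}$$

be a string at node $(P_e, w_e)$, where $u' \in T_b^*$ and $p \in \{1, 2\}$. We note that
$u' = \lambda$ for $p = 1$. Then, at the first step, with rules

$$\begin{array}{lll}
A_p \to \bar{A}_3A_3, & B_p \to \bar{B}_3B_3, & \$_{p,A} \to \bar{\$}_{3,A}\$_{3,A}, \\
\$_{p,B} \to \$_{3,B}, & \$_{0,p} \to \bar{Z}\bar{\$}_{0,3}\$_{0,3}, & \$_{1,p} \to \bar{\$}_{1,3}\$_{1,3},
\end{array}$$

and $b_i \to \bar{c}_ic_i$, for $1 \leq i \leq n$, the string is rewritten to

$$\bar{Z}\bar{\$}_{0,3}\$_{0,3}u''\bar{\$}_{1,3}\$_{1,3}(\bar{A}_3A_3)^{k}\bar{\$}_{3,A}\$_{3,A}(\bar{B}_3B_3)^{l}\$_{3,B},$$

where $u'' = \lambda$ for $u' = \lambda$ (and $p = 1$, above) and
$u'' = \bar{c}_{i_1}c_{i_1} \ldots \bar{c}_{i_m}c_{i_m}$ for $u' = b_{i_1} \ldots b_{i_m}$, with
$b_{i_j} \in T_b$, for $1 \leq j \leq m$.

Then, either the string is a good string and the generation continues
with this string, or the string turns to its complement, obtaining letter
$Z$ at the first position.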

The rule set $P_e$ also contains rules $Z \to F$, $c_i \to d_i$, $\bar{c}_i \to \bar{d}_i$, where
$1 \leq i \leq n$, $A_3 \to \bar{A}_4$, $B_3 \to B_4$, $A_4 \to F$, $\bar{B}_4 \to F$, and $X \to \lambda$ for
$X \in \{\bar{Z}, \bar{A}_3, \bar{B}_3, \$_{3,A}, \bar{\$}_{3,A}, \$_{3,B}, \bar{\$}_{3,B}, \$_{1,3}, \bar{\$}_{1,3}, \$_{0,3}, \bar{\$}_{0,3}\}$.
Thus, in the next derivation step either a string with $F$ at the first position or a
string of the form

$$v'' = \bar{d}_{i_1}d_{i_1} \ldots \bar{d}_{i_m}d_{i_m}\bar{A}_4^{k}B_4^{l}$$

is obtained. The string with $F$ at the first position and any other string
originating from it will never lead to a terminal word. Suppose that the derivation
continues with $v''$.

Then, the derivation at $(P_e, w_e)$ will lead to a string over $T$ only if $k \leq l$;
otherwise the string turns to its complement and at the next derivation step
occurrences of the trap symbol $F$ will be introduced, and thus neither the string
nor any other string originating from it can lead to a terminal word.

Suppose that the derivation leads to a terminal string at node $(P_e, w_e)$. Then,
having productions $d_i \to f_i$, $\bar{d}_i \to \bar{f}_i$, $1 \leq i \leq n$, $\bar{A}_4 \to A_5$,
$B_4 \to \bar{B}_5$, $\bar{A}_5 \to F$, $B_5 \to F$ in $P_e$, we obtain a string of the form

$$v''' = \bar{f}_{i_1}f_{i_1} \ldots \bar{f}_{i_m}f_{i_m}A_5^{k}\bar{B}_5^{l}.$$

Again, the derivation will lead to a terminal string at this node only if $k \geq l$;
otherwise, at the next derivation step occurrences of the trap symbol will be
introduced.

Suppose that a derivation to a terminal word continues at node $(P_e, w_e)$.
Having rules $f_i \to a_i$, $\bar{f}_i \to \lambda$, $1 \leq i \leq n$, and $A_5 \to \lambda$,
$\bar{B}_5 \to \lambda$ we obtain string $a_{i_1} \ldots a_{i_m}$. For any other letter $X$ in
$\Sigma$, not listed with productions above, the node has the rule $X \to F$.

Notice that the derivation results in the empty word if and only if
$\lambda \in L(P)$ holds.
Now we should prove that $\Gamma$ derives all words of $L$ but not more.

Suppose that $x_1 \ldots x_m \in L$, $x_i \in T$, $1 \leq i \leq m$, that is, there are
indices $s_1, \ldots, s_t \in \{1, \ldots, r\}$ such that
$u_{s_1} \ldots u_{s_t} = v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$ holds.
Then $x_1 \ldots x_m$ can be obtained in $\Gamma$ as follows: First $E$, the axiom of the
node for simulating the effect of adding the pair $(u_{s_1}, v_{s_1})$ (the axiom of the
node for $(u_{s_1}, v_{s_1})$, for short), is rewritten to the string representing
$u_{s_1}v_{s_1}$ in the coded form, and then, by communication the string is forwarded
to the node for $(u_{s_2}, v_{s_2})$. Then, the communicated string is rewritten in two
derivation steps at this node and it is forwarded to the next node for $(u, v)$ in the
order. We continue this procedure until the string representing
$u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t}$ is generated at the node for $(u_{s_t}, v_{s_t})$. Then,
the string is communicated to the node for $(x_1, z_{x_1})$, where it is rewritten in two
derivation steps and then it is communicated to the next node in the order, a node
for some pair $(x, z_x)$. Continuing this procedure, we finish this part of the
generation at the node for $(x_m, z_{x_m})$ with a string representing
$x_1 \ldots x_m u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$. Then the string is
forwarded to node $(P_e, w_e)$, where in some steps its substring representing
$\alpha\beta' = u_{s_1} \ldots u_{s_t} v_{s_1} \ldots v_{s_t} z_{x_1} \ldots z_{x_m}$ is eliminated and the
corresponding letters from $T$ are introduced. Thus, $x_1 \ldots x_m$ is an element of
$L_{(pref,a)}(\Gamma)$. The procedure for computing $\lambda \in L_{(pref,a)}(\Gamma)$, if
$\lambda \in L(P)$, is analogous.

We should prove that $\Gamma$ does not generate a word not in $L$. By the definition
of the rule sets of the nodes, we can see that for each string generated at the
node or communicated to the node, the node for the pair $(u_j, v_j)$, $1 \leq j \leq r$,
either produces a new string representing a word of one of the forms $u_jv_j$ or
$u_{s_1} \ldots u_{s_t}u_jv_{s_1} \ldots v_{s_t}v_j$, $s_1, \ldots, s_t \in \{1, \ldots, r\}$, $t \geq 1$, or it
produces a new string which contains the trap symbol $F$ at the first position,
which makes it impossible to generate a terminal word. Then this string and any
other string originating from this one is irrelevant from the point of view of
generation of terminal words of $\Gamma$. Analogously, for each string generated at the
node or communicated to the node, it holds that the node for $(a_i, z_{a_i})$,
$1 \leq i \leq n$, either produces a string representing a string of the form
$ua_iu_{s_1} \ldots u_{s_t}v_{s_1} \ldots v_{s_t}z_uz_{a_i}$, where $u \in T^*$ and $z_u$ is the
sequence of $z$-s which corresponds to $u$, or it generates a string with an
occurrence of the trap symbol, $F$. The latter case leads to strings irrelevant from
the point of view of generation of words of $\Gamma$. But only those strings have no
occurrence of the trap symbol at the first position at the above two types of nodes
which represent strings that correspond to the respective generation phases of
words of $L$ according to EPC $P$. Similarly to the above cases, the master node,
$(P_e, w_e)$, either produces a terminal string (or the empty word) from a string it
has generated or it received by communication, or the node generates a string
with an occurrence of the trap symbol. Thus, any terminal word (including the
empty word) which can be generated by $\Gamma$ can be generated according to $P$ but
not more. Hence the result.

By standard techniques it can be shown that any language of an extended
NSWD0L system is a recursively enumerable language. Thus we can state the
following theorem:

Theorem 2. The class of languages of ENSWD0L systems is equal to the class
of recursively enumerable languages.

Modifying the proof of Theorem 1, an analogous statement can be given for
the case of communication protocol $(suf, a)$. The idea of the proof is to change
the role of the endmarker symbols $\$_{0,p}$ and $\$_{p,B}$, for $p = 1, 2, 3$, in the
procedure of checking whether the communicated string is a proper subword of the
original string or not. We give this statement without proof; the details are left to
the reader.

Theorem 3. For every recursively enumerable language $L$ there exists an
ENSWD0L system $\Gamma$ such that $L_{(suf,a)}(\Gamma) = L$.

4 Final Remarks 

In this paper we examined the computational power of ENSWD0L systems
with a certain type of incomplete information communication, namely where
the nodes communicate good non-empty prefixes (suffixes) of the good strings
they obtained by rewriting. It is an interesting open question how large a
computational power can be obtained if some other way of incomplete communication
is chosen. For example, it would be interesting to study the case where the node
communicates an arbitrary non-empty good subword of the good words obtained
by parallel rewriting or the case where the node splits a copy of the word to be
communicated into as many pieces as the number of the other nodes in the
network and these subwords are distributed among the different nodes. We
plan to return to these topics in the future.

Abstract. Probabilistic cooperating distributed grammar systems
introduced in [1] are systems of probabilistic grammars in the sense of
[9], i.e., a probability is associated with any transition from one rule to
another rule and with any transition from one probabilistic grammar
to another probabilistic grammar; a probabilistic grammar stops, if the
chosen rule cannot be applied; and the generated language contains only
words where the product of the transition probabilities is larger than a
certain cut-point. We study the families obtained with cut-point 0 by
restricting the number of rules in a probabilistic component. We show that
at most two productions in any component are sufficient to generate any
recursively enumerable language. If one restricts to probabilistic components
with one production in any component, then one obtains the family of
deterministic ET0L systems.



1 Introduction 

Cooperating distributed grammar systems have been introduced in [3] as a for- 
mal language theoretic approach to the blackboard architecture known from the 
distributed problem solving. Essentially, such a system consists of some context- 
free grammars (called the components) which work on a common sentential form 
and where the conditions for a grammar to start and/or to stop are prescribed 
in a protocol or derivation mode. The most investigated derivation mode is the 
so-called t-mode, where a grammar has to work as long as it can apply some of its 
productions, and if a component has finished its derivation, then any other
enabled component can start. It has been shown in [3] that cooperating distributed
grammar systems have the same generative power as ET0L systems known from

the theory of developmental or Lindenmayer systems (see [7] and [8]).
Furthermore, in [6] it has been proved that any ET0L language can be generated by a
cooperating distributed grammar system where any component has at most five
productions.

* This research was supported in part under grant no. D-35/2000 and HUN009/00
by the Intergovernmental S&T Cooperation Programme of the Office of Research
and Development Division of the Hungarian Ministry of Education and its German
partner, the Federal Ministry of Education and Research (BMBF).

Obviously, instead of context-free grammars one can also use another type 
of grammars as basic grammars. For instance, in [3] and [11] some variants of 
Lindenmayer systems have been taken as basic grammars. In [1], probabilistic 
grammars introduced by A. Salomaa in [9] have been used as components, i.e., 
with any transition from one rule to another rule a probability is associated and 
the generated language contains only words where the product of the transition
probabilities is larger than a certain cut-point. Moreover, in the case of grammar systems a
probability is associated with a transition from one probabilistic grammar to 
another one, and in the t-mode of derivation a probabilistic grammar stops, if 
after the application of a rule one chooses a rule which cannot be applied. In 
[1] it has been shown that any language generated by a probabilistic cooperat- 
ing distributed grammar system with cut-point c > 0 is a finite language and 
that any recursively enumerable language can be generated by a probabilistic 
cooperating distributed grammar system with cut-point 0. 

In [2] a theorem analogous to the result of [6] mentioned above has been
given: For any recursively enumerable language $L$, there is a probabilistic
cooperating distributed grammar system $\Gamma$ such that any probabilistic component
of $\Gamma$ contains at most six productions and $L$ is the language generated by $\Gamma$
with cut-point 0. In this paper we improve this result. We show that at most
two productions in the probabilistic components are sufficient to generate (with
cut-point 0) any recursively enumerable language. If one restricts to probabilistic
cooperating distributed grammar systems with one production in any
component, then one obtains the same generative power as the power of deterministic
ET0L systems.

2 Definitions 

An $n$-dimensional vector $(a_1, a_2, \ldots, a_n)$ is called probabilistic, if
$0 \leq a_i \leq 1$ for $1 \leq i \leq n$ and $\sum_{i=1}^{n} a_i = 1$. The cardinality of a
(finite) set $M$ is denoted by $\#(M)$. The set of non-empty words over an alphabet
$V$ is denoted by $V^+$; if the empty string, denoted by $\lambda$, is included, then we
use the notation $V^*$.
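
The condition on probabilistic vectors is easy to check mechanically. The following Python sketch is only an illustration; the function name is hypothetical and the sum is compared with a floating-point tolerance.

```python
import math

def is_probabilistic(vector) -> bool:
    """True if every component lies in [0, 1] and the components sum to 1."""
    in_range = all(0 <= a <= 1 for a in vector)
    return in_range and math.isclose(sum(vector), 1.0)

print(is_probabilistic([0.5, 0.25, 0.25]))  # True
print(is_probabilistic([0.5, 0.6]))         # False: the sum is 1.1
print(is_probabilistic([1.5, -0.5]))        # False: components out of range
```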

We recall the notions of matrix grammar, Indian parallel programmed gram- 
mars and extended deterministic tabled Lindenmayer systems. For further details 
we refer to [5], [10], [8] and [7]. 

A matrix grammar is a construct $G = (N, T, S, M, F)$ where $N$ and $T$ are
the disjoint alphabets of nonterminals and terminals, respectively, $S \in N$,
$M = \{m_1, m_2, \ldots, m_n\}$ is a finite set of finite sequences of context-free
productions, i.e., for $1 \leq i \leq n$,
$m_i = (A_{i1} \to w_{i1}, A_{i2} \to w_{i2}, \ldots, A_{ir_i} \to w_{ir_i})$ with $r_i \geq 1$,
$A_{ij} \in N$ and $w_{ij} \in (N \cup T)^*$ for $1 \leq i \leq n$, $1 \leq j \leq r_i$, and $F$ is a
finite subset of $\{A_{ij} \to w_{ij} \mid 1 \leq i \leq n,\ 1 \leq j \leq r_i\}$. The sequences
$m_i$, $1 \leq i \leq n$, are called matrices.

Let $x$ and $y$ be two words of $(N \cup T)^*$. We say that $x$ directly derives $y$ by an
application of $m_i \in M$, written as $x \Rightarrow_{m_i} y$, if there are words
$x_1, x_2, \ldots, x_{r_i+1}$ such that $x = x_1$, $y = x_{r_i+1}$ and, for $1 \leq j \leq r_i$, one
of the following conditions is satisfied:

i) $x_j = x'_jA_{ij}x''_j$ for some $x'_j, x''_j \in (N \cup T)^*$ and $x_{j+1} = x'_jw_{ij}x''_j$, or

ii) $A_{ij}$ does not occur in $x_j$, $A_{ij} \to w_{ij} \in F$ and $x_{j+1} = x_j$.

The language $L(G)$ generated by a matrix grammar $G = (N, T, S, M, F)$
consists of all words $z \in T^*$ such that there is a derivation

$$S \Rightarrow_{m_{i_1}} z_1 \Rightarrow_{m_{i_2}} z_2 \Rightarrow_{m_{i_3}} \cdots \Rightarrow_{m_{i_s}} z_s = z$$

for $s \geq 1$ and some matrices $m_{i_1}, m_{i_2}, \ldots, m_{i_s} \in M$.
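
A direct derivation step by a matrix, including the case where a rule from $F$ is skipped because its left-hand side is absent, can be sketched as follows. The representation is an assumption made for illustration: symbols are single characters, a rule is a pair `(lhs, rhs)`, and the function returns all words reachable by one application of the matrix, since the rewritten occurrence may be chosen freely.

```python
def apply_matrix(word, matrix, appearance_rules):
    """All words y with word => y by one application of the matrix.

    matrix: list of (lhs, rhs) rules applied in the given order.
    appearance_rules: the set F of rules that may be skipped when their
    left-hand side does not occur in the current word.
    """
    current = {word}
    for lhs, rhs in matrix:
        following = set()
        for x in current:
            positions = [i for i, symbol in enumerate(x) if symbol == lhs]
            if positions:
                # Condition i): rewrite one occurrence, chosen arbitrarily.
                for i in positions:
                    following.add(x[:i] + rhs + x[i + 1:])
            elif (lhs, rhs) in appearance_rules:
                # Condition ii): the rule is skipped.
                following.add(x)
        current = following
    return current

print(apply_matrix("AX", [("A", "aAb"), ("X", "Y")], set()))          # {'aAbY'}
print(apply_matrix("AX", [("B", "c"), ("X", "Y")], {("B", "c")}))     # {'AY'}
print(apply_matrix("AX", [("B", "c"), ("X", "Y")], set()))            # set()
```

The last call returns the empty set because the first rule is neither applicable nor in $F$, so the matrix as a whole cannot be applied.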

It is well-known (see [5] for a proof) that the family of languages generated
by matrix grammars coincides with the family $\mathcal{L}(RE)$ of recursively enumerable
languages.

We say that a matrix grammar $G = (N, T, S, M, F)$ is in normal form if

— $N = N_1 \cup N_2 \cup \{S, Z\}$ with $N_1 \cap N_2 = \emptyset$, $S, Z \notin N_1 \cup N_2$,

— all matrices of $M$ have one of the following forms

(1) $(S \to x)$ with $x \in L(G)$, $|x| \leq 1$,

(2) $(S \to AX)$ with $A \in N_1$, $X \in N_2$,

(3) $(A \to w, X \to Y)$ with $A \in N_1$, $w \in (N_1 \cup T)^*$, $X, Y \in N_2$, $X \neq Y$,

(4) $(A \to Z, X \to Y)$ with $A \in N_1$, $X, Y \in N_2$,

(5) $(A \to w, X \to a)$ with $A \in N_1$, $w \in T^*$, $X \in N_2$, $a \in T$,

— $F$ consists of all rules of the form $A \to Z$ with $A \in N_1$.

For X G N 2 , we say that m is an X-matrix, if m contains a rule with right hand 
side X. By na{X) we denote the number of X-matrices. 

Lemma 1. For any recursively enumerable language L, there is a matrix gram- 
mar G in normal form such that L = L{G). 

Proof. The statement is shown in Theorem 1.3.7 of [5] for matrix grammars
in accurate binary normal form, which is obtained from our normal form by
deletion of the conditions $X \neq Y$ for matrices of type (3). However, it can be
seen from the proof in [5] that our additional condition is satisfied.

An Indian parallel programmed grammar is a construct $G = (N, T, S, P)$
where any rule $p \in P$ has the form $p = (A \to w, \sigma, \varphi)$, where $A \to w$ is a
context-free production with $A \in N$ and $w \in (N \cup T)^*$, and $\sigma$ and $\varphi$ are
subsets of $P$ called the success field and failure field, respectively. The language
$L(G)$ generated by $G$ consists of all words $z \in T^*$ which can be obtained by a
derivation of the form

\[ S = z_0 \Longrightarrow_{p_1} z_1 \Longrightarrow_{p_2} z_2 \Longrightarrow_{p_3} \cdots \Longrightarrow_{p_n} z_n = z \]

where $n \ge 1$, and for $1 \le i \le n - 1$, one of the following conditions is satisfied:

i) $p_i = (A_i \to w_i, \sigma_i, \varphi_i)$,

$z_{i-1} = x_1 A_i x_2 A_i \ldots x_{r-1} A_i x_r$ with $x_j \in ((N \cup T) \setminus \{A_i\})^*$,

$z_i = x_1 w_i x_2 w_i \ldots x_{r-1} w_i x_r$, and
$p_{i+1} \in \sigma_i$,
or

ii) $p_i = (A_i \to w_i, \sigma_i, \varphi_i)$,

$A_i$ does not occur in $z_{i-1}$,

$z_i = z_{i-1}$, and
$p_{i+1} \in \varphi_i$.
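
The two cases of a derivation step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the definition: every symbol is a single character, a rule is a tuple (left-hand side, right-hand side, success field, failure field), and the rule names p1, p2, p3 are hypothetical.

```python
def indian_parallel_step(form, rule):
    """Apply one rule in the Indian parallel mode.

    Returns the new sentential form and the set of rule names that may be
    used next: the success field if the left-hand side occurred (case i),
    the failure field otherwise (case ii).
    """
    lhs, rhs, success, failure = rule
    if lhs in form:
        # Case i): every occurrence of lhs is rewritten by the same rule.
        return form.replace(lhs, rhs), success
    # Case ii): the form is unchanged and the failure field is used.
    return form, failure


# Hypothetical toy rules (names and fields are assumptions of this sketch).
rules = {
    "p1": ("S", "AA", {"p2", "p3"}, set()),
    "p2": ("A", "aA", {"p2", "p3"}, set()),
    "p3": ("A", "b", set(), set()),
}

form, allowed = indian_parallel_step("S", rules["p1"])
form, allowed = indian_parallel_step(form, rules["p2"])
form, allowed = indian_parallel_step(form, rules["p3"])
```

Because both occurrences of A are always rewritten together, the two halves of the word stay identical, which is what distinguishes this mode from ordinary context-free rewriting.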

An extended deterministic tabled Lindenmayer system (for short, EDT0L system)
is an $(r + 3)$-tuple $G = (V, T, h_1, h_2, \ldots, h_r, w)$, where $V$ is an alphabet, the
set $T$ (of terminals) is a subset of $V$, $w \in V^+$ is the axiom and, for $1 \le i \le r$,
$h_i : V \to V^*$ is a morphism.

For two strings $x = x_1 x_2 \cdots x_n$ with $n \ge 1$, $x_i \in V$ for $1 \le i \le n$, and $y \in V^*$,
we say that $x$ directly derives $y$ if there is a morphism $h_j$, $1 \le j \le r$, such
that $y = h_j(x) = h_j(x_1) h_j(x_2) \cdots h_j(x_n)$. The language $L(G)$ generated by an
EDT0L system is defined as the set of all words over $T$ which can be obtained
from $w$ by a sequence of direct derivation steps, i.e.,

\[ L(G) = \{ h_{i_1}(h_{i_2}(\ldots(h_{i_s}(w))\ldots)) \mid 1 \le i_j \le r \text{ for } 1 \le j \le s \} \cap T^* . \]

By $\mathcal{L}(EDT0L)$ we denote the family of languages generated by EDT0L
systems.
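
As an illustration of the definition, the following Python sketch enumerates the terminal words reachable within a bounded number of derivation steps. The function names, the bound max_steps, and the example system (axiom ABC with two tables) are assumptions of this sketch, not taken from the paper; morphisms are represented as dictionaries that are total on the alphabet.

```python
def apply_morphism(h, word):
    """Apply the morphism h (a dict letter -> word) letter by letter."""
    return "".join(h[letter] for letter in word)


def edt0l_words(tables, axiom, terminals, max_steps):
    """Terminal words derivable from the axiom in at most max_steps steps."""
    seen = {axiom}
    frontier = {axiom}
    for _ in range(max_steps):
        # One direct derivation step: some table is applied to the whole word.
        frontier = {apply_morphism(h, x) for x in frontier for h in tables} - seen
        seen |= frontier
    return {x for x in seen if set(x) <= set(terminals)}


# Hypothetical example system: h1 grows the word, h2 terminates it.
identity = {c: c for c in "abc"}
h1 = {**identity, "A": "aA", "B": "bB", "C": "cC"}
h2 = {**identity, "A": "a", "B": "b", "C": "c"}

words = edt0l_words([h1, h2], "ABC", "abc", max_steps=3)
```

In this toy system every derived terminal word has the shape a^n b^n c^n, a language that is not context-free; the bound only limits how many of these words are listed.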

We now define the central concept of this paper, the probabilistic cooperating 
distributed grammar systems. 

A probabilistic cooperating distributed grammar system is a construct

\[ \Gamma = (N, T, S, (P_1, \delta_1, \phi_1, \phi'_1), (P_2, \delta_2, \phi_2, \phi'_2), \ldots, (P_n, \delta_n, \phi_n, \phi'_n), \delta) \]

where

- $N$ and $T$ are disjoint alphabets of nonterminals and terminals, respectively,

- $S$, called the axiom, is an element of $N$,

- $n$ is a positive integer,

- for $1 \le i \le n$,

• $P_i = \{p_{i1}, p_{i2}, \ldots, p_{ik_i}\}$ is a finite set of context-free productions (i.e.,
each $p_{ij}$ is of the form $A \to w$ with $A \in N$ and $w \in (N \cup T)^*$) where the
given order of the $k_i$ elements of $P_i$ is fixed,

• $\delta_i$ is a $k_i$-dimensional probabilistic vector, whose $j$-th component $\delta_i(j)$ gives
the probability to start a derivation, which uses only rules of $P_i$, with
the $j$-th rule $p_{ij}$ of $P_i$,

• $\phi_i$ is a $k_i$-dimensional vector, whose $j$-th component is a $k_i$-dimensional
probabilistic vector $\phi_i(j) = (\phi_{ij1}, \ldots, \phi_{ijk_i})$ whose $k$-th component
$\phi_{ijk}$ gives the probability that after an application of $p_{ij}$ we apply $p_{ik}$
as the next rule,

• $\phi'_i$ is an $n$-dimensional probabilistic vector, whose $j$-th component $\phi'_i(j)$
gives the probability that after an application of the component $P_i$ we
continue with the component $P_j$,

- $\delta$ is an $n$-dimensional probabilistic vector, whose $i$-th component $\delta(i)$ gives
the probability that the derivation starts with the $i$-th component $P_i$.

The constructs $(P_i, \delta_i, \phi_i, \phi'_i)$, $1 \le i \le n$, are called the components of $\Gamma$.
Sometimes we also say that $P_i$ is a component.

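Let

\[ D : x = x_0 \Longrightarrow_{p_{ij_1}} x_1 \Longrightarrow_{p_{ij_2}} x_2 \Longrightarrow \cdots \Longrightarrow_{p_{ij_s}} x_s = y \]

be a derivation which only uses rules of $P_i$. For short, we write

\[ D : x \Longrightarrow_{P_i} y . \]

We say that $D$ is a t-derivation with respect to $P_i$ if

i) $y$ is a word over the terminal alphabet $T$, or

ii) the production $p_{ij_{s+1}} \in P_i$ chosen to be applied in the next step cannot
be applied.

With a t-derivation $D$ we associate in case i) or ii) the values

\[ v(D) = \delta_i(j_1) \cdot \phi_{ij_1j_2} \cdot \phi_{ij_2j_3} \cdots \phi_{ij_{s-1}j_s} \]

or

\[ v(D) = \delta_i(j_1) \cdot \phi_{ij_1j_2} \cdot \phi_{ij_2j_3} \cdots \phi_{ij_sj_{s+1}} , \]

respectively.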

Let

\[ D' : S = y_0 \Longrightarrow_{P_{i_1}} y_1 \Longrightarrow_{P_{i_2}} y_2 \Longrightarrow \cdots \Longrightarrow_{P_{i_r}} y_r = z \qquad (1) \]

be a derivation such that any subderivation

\[ D_j : y_{j-1} \Longrightarrow_{P_{i_j}} y_j \]

is a t-derivation with respect to $P_{i_j}$. With $D'$ we associate the value

\[ v(D') = \delta(i_1) \cdot \prod_{j=1}^{r-1} \phi'_{i_j}(i_{j+1}) \cdot \prod_{j=1}^{r} v(D_j) \]

(the first factor gives the probability to start with the component $P_{i_1}$, the second
factor takes into consideration the transitions from one component to another
one, whereas the third factor measures the derivations $D_j$). The language $L(\Gamma, c)$
with cut-point $c$ consists of all words $z \in T^*$ which can be obtained by a
derivation $D'$ of the form (1) such that $v(D') > c$.
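
The three factors described above can be made concrete with a small computation. The sketch below is only an illustration: it reads the value of a derivation as the product of the start probability, the component-to-component transition probabilities, and the values of the subderivations, uses 0-based indices, and all probability values in it are hypothetical.

```python
from fractions import Fraction
from math import prod


def component_value(delta_i, phi_i, rules):
    """Value of a t-derivation inside one component.

    delta_i[j]  : probability of starting with rule j
    phi_i[j][k] : probability of choosing rule k right after rule j
    rules       : indices of the chosen rules in order; for a derivation
                  that stops because the next chosen rule is not
                  applicable, that blocked rule is the last entry
    """
    value = delta_i[rules[0]]
    for j, k in zip(rules, rules[1:]):
        value *= phi_i[j][k]
    return value


def derivation_value(delta, phi_prime, components, values):
    """Start probability times component transitions times subderivation values."""
    value = delta[components[0]]
    for i, j in zip(components, components[1:]):
        value *= phi_prime[i][j]
    return value * prod(values)


# Hypothetical numbers for a system with two components.
delta_1 = [Fraction(1, 2), Fraction(1, 2)]
phi_1 = [[Fraction(1, 4), Fraction(3, 4)], [Fraction(1), Fraction(0)]]
v1 = component_value(delta_1, phi_1, [0, 1, 0])

delta = [Fraction(1, 3), Fraction(2, 3)]
phi_prime = [[Fraction(0), Fraction(1)], [Fraction(1, 2), Fraction(1, 2)]]
v = derivation_value(delta, phi_prime, [1, 0], [v1, Fraction(1)])
```

A word belongs to the language with cut-point c only if some derivation of it has a value strictly greater than c, so with cut-point 0 only the positions of the nonzero entries matter.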
Our definitions given above differ slightly from the definitions presented in [2].
Especially, we have a more accurate value associated with a derivation. However, 
it is easy to see that both definitions are equivalent. 

In [1] it has been shown that any language generated by a probabilistic
cooperating distributed grammar system with cut-point $c > 0$ is a finite language, and
that any recursively enumerable language can be generated by a probabilistic
cooperating distributed grammar system with cut-point 0. The latter statement
can be seen easily since a probabilistic cooperating distributed grammar system
with cut-point 0 can be transformed into a programmed grammar where the
success field of a rule $p_{ij} \in P_i$ consists of all rules $p_{ik} \in P_i$ such that $\phi_{ijk} > 0$ and
the failure field of $p_{ij}$ consists of all rules $p_{rs} \in P_r$, $1 \le r \le n$, with $\phi'_i(r) > 0$
and $\delta_r(s) > 0$.

In this paper we shall discuss only languages with cut-point 0. Therefore we
use the notation $L(\Gamma)$ instead of $L(\Gamma, 0)$.

By $\mathcal{L}(PCD_rCF)$ we denote the family of all languages generated by
probabilistic cooperating distributed grammar systems $\Gamma$ with cut-point 0 where each
component of $\Gamma$ contains at most $r$ productions.

The following lemma immediately follows from the definitions. 

Lemma 2. For any $r \ge 1$, $\mathcal{L}(PCD_rCF) \subseteq \mathcal{L}(PCD_{r+1}CF)$.

3 Results 

We start with an investigation of probabilistic grammar systems where each 
component contains only one rule. 

Lemma 3. For any probabilistic cooperating distributed grammar system $\Gamma$
where each component contains exactly one rule, there is an Indian parallel
programmed grammar $G$ such that $L(G) = L(\Gamma)$.

Proof. Let $\Gamma = (N, T, S, (\{A_1 \to w_1\}, \delta_1, \phi_1, \phi'_1), \ldots, (\{A_n \to w_n\}, \delta_n, \phi_n, \phi'_n), \delta)$
be an arbitrary probabilistic cooperating distributed grammar system where
each component contains exactly one rule. Then $\delta_i = \phi_i(1) = (1)$ for $1 \le i \le n$.

Let us assume that we have to apply $(\{A_i \to w_i\}, (1), (1), \phi'_i)$ to the sentential
form $w$. If $A_i$ occurs in $w$ and in $w_i$, then we have to apply the rule $A_i \to w_i$ ad
infinitum and do not terminate. Thus we do not change the generated language
if we substitute $(\{A_i \to w_i\}, (1), (1), \phi'_i)$ by $(\{A_i \to F\}, (1), (1), \phi'_i)$, where $F$
is an additional symbol (because we are not able to terminate the letter $F$).
Therefore, without loss of generality, we can assume that $A_i$ does not occur in
$w_i$ for $1 \le i \le n$.

The application of $(\{A_i \to w_i\}, (1), (1), \phi'_i)$ to $w = x_1 A_i x_2 A_i \ldots x_k A_i x_{k+1}$
with $x_j \in ((N \cup T) \setminus \{A_i\})^*$ leads to $w' = x_1 w_i x_2 w_i \ldots x_k w_i x_{k+1}$, i.e., we have
performed a derivation step as in the Indian parallel mode (see condition i) of
the definition of the derivation step in an Indian parallel programmed grammar).
We now construct the Indian parallel programmed grammar

\[ G = (N \cup \{S'\}, T, S', \{p_0, p_1, \ldots, p_n\}) \]

with

\[ p_0 = (S' \to S, \{p_j \mid \delta(j) > 0\}, \{p_j \mid \delta(j) > 0\}) , \]

\[ p_i = (A_i \to w_i, \{p_j \mid \phi'_i(j) > 0\}, \{p_j \mid \phi'_i(j) > 0\}) \text{ for } 1 \le i \le n. \]

Any derivation in $G$ starts with an application of $p_0$ and leads to $S$, to which
all rules can be applied which correspond to components of $\Gamma$ whose start
probability is greater than 0. Moreover, in the sequel we have $w \Longrightarrow w'$ by application
of the component $P_i$ in $\Gamma$ if and only if we have $w \Longrightarrow w'$ by application of $p_i$
in $G$. Therefore $L(G) = L(\Gamma)$ follows.

Lemma 4. $\mathcal{L}(EDT0L) \subseteq \mathcal{L}(PCD_1CF)$.

Proof. Let $L \in \mathcal{L}(EDT0L)$. By [8], Chapter V, Theorem 1.3 (since the
construction in its proof gives an EDT0L system if we start with an EDT0L
system, this theorem holds for EDT0L systems, too), there is an EDT0L system
$G = (V, T, \{h_1, h_2\}, w)$ (with only two homomorphisms) such that $L = L(G)$.
Let

\[ V = \{a_1, a_2, \ldots, a_m\}, \quad V' = \{a'_i \mid 1 \le i \le m\} \quad \text{and} \quad V'' = \{a''_i \mid 1 \le i \le m\} . \]

Moreover, if $w = b_1 b_2 \ldots b_n$, $b_i \in V$ for $1 \le i \le n$, then we set $w' = b'_1 b'_2 \cdots b'_n$ and
$w'' = b''_1 b''_2 \cdots b''_n$. Furthermore, we define the homomorphism $h : V' \to T \cup \{F\}$
by $h(a') = a$ for $a \in T$ and $h(a') = F$ in the remaining cases.

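We now construct the probabilistic CD grammar system

\[ \Gamma = (V' \cup V'' \cup \{S, F\}, T, S, (P_1, \delta_1, \phi_1, \phi'_1), \ldots, (P_{4m+1}, \delta_{4m+1}, \phi_{4m+1}, \phi'_{4m+1}), \delta) \]

where

\[ \delta = (0, 0, \ldots, 0, 1), \]

\[ P_{4m+1} = \{S \to w'\}, \quad \delta_{4m+1} = \phi_{4m+1}(1) = (1), \quad \phi'_{4m+1}(j) = \begin{cases} 1/2 & j \in \{1, 3m+1\} \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_m = \{a'_m \to a''_m\}, \quad \delta_m = \phi_m(1) = (1), \quad \phi'_m(j) = \begin{cases} 1/2 & j \in \{m+1, 2m+1\} \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_{2m} = \{a''_m \to h_1(a_m)'\}, \quad \delta_{2m} = \phi_{2m}(1) = (1), \quad \phi'_{2m}(j) = \begin{cases} 1/2 & j \in \{1, 3m+1\} \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_{3m} = \{a''_m \to h_2(a_m)'\}, \quad \delta_{3m} = \phi_{3m}(1) = (1), \quad \phi'_{3m}(j) = \begin{cases} 1/2 & j \in \{1, 3m+1\} \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_{4m} = \{a'_m \to h(a'_m)\}, \quad \delta_{4m} = \phi_{4m}(1) = (1), \quad \phi'_{4m} = (0, 0, \ldots, 0, 1), \]

and for $1 \le i \le m - 1$,

\[ P_i = \{a'_i \to a''_i\}, \quad \delta_i = \phi_i(1) = (1), \quad \phi'_i(j) = \begin{cases} 1 & j = i + 1 \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_{m+i} = \{a''_i \to h_1(a_i)'\}, \quad \delta_{m+i} = \phi_{m+i}(1) = (1), \quad \phi'_{m+i}(j) = \begin{cases} 1 & j = m + i + 1 \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_{2m+i} = \{a''_i \to h_2(a_i)'\}, \quad \delta_{2m+i} = \phi_{2m+i}(1) = (1), \quad \phi'_{2m+i}(j) = \begin{cases} 1 & j = 2m + i + 1 \\ 0 & \text{otherwise} \end{cases}, \]

\[ P_{3m+i} = \{a'_i \to h(a'_i)\}, \quad \delta_{3m+i} = \phi_{3m+i}(1) = (1), \quad \phi'_{3m+i}(j) = \begin{cases} 1 & j = 3m + i + 1 \\ 0 & \text{otherwise} \end{cases}. \]
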
By our construction, we have to start with the component $P_{4m+1}$, which leads
to $w'$, and we have to continue with $P_1$ or $P_{3m+1}$.

Let us now assume that we have a sentential form $v'$ for some $v \in V^*$ and
that we can apply the components $P_1$ or $P_{3m+1}$.

Using $P_{3m+1}$ we substitute all occurrences of $a'_1$ by $h(a'_1)$ and have to continue
with $P_{3m+2}$, which corresponds to a substitution of all occurrences of $a'_2$ by $h(a'_2)$,
and so on. After using $P_{4m}$ we have replaced all letters of $v'$ and obtain $h(v')$.
We get a word containing an $F$ (and the derivation cannot be terminated) or
the terminal word $v$.

Using $P_1$ we replace, by the use of the components $P_1, P_2, \ldots, P_m$ in succession,
all occurrences of primed letters by the corresponding two-primed versions, i.e.,
we get $v''$. Moreover, we have to continue with $P_{m+1}$ or $P_{2m+1}$. In the former
case we apply in succession the components $P_{m+1}, P_{m+2}, \ldots, P_{2m}$ and replace
each letter $a''_i$ by $h_1(a_i)'$. Thus we obtain $h_1(v)'$. In the latter case we obtain
$h_2(v)'$. Therefore we have simulated a derivation step according to the EDT0L
system $G$.

By these remarks it is obvious that $L = L(G) = L(\Gamma)$.

Now we turn to probabilistic grammar systems with at most two rules in a 
component. 

Lemma 5. $\mathcal{L}(RE) \subseteq \mathcal{L}(PCD_2CF)$.

Proof. Let $L$ be a recursively enumerable language. By Lemma 1, there is a
matrix grammar $G = (N, T, S, M, F)$ in normal form such that $L = L(G)$. Let
us assume that, for $1 \le i \le 5$, there are $k_i$ matrices of type $(i)$ in $M$. We set
$l_0 = 0$ and $l_i = k_1 + k_2 + \cdots + k_i$ for $1 \le i \le 5$. We number the matrices of type
$(i)$ from $l_{i-1} + 1$ to $l_i$.
We define the probabilistic cooperating distributed grammar system

\[ \Gamma = (N, T, S, (P_1, \delta_1, \phi_1, \phi'_1), \ldots, (P_{l_5+1}, \delta_{l_5+1}, \phi_{l_5+1}, \phi'_{l_5+1}), \delta) \]

where the component $P_i$ is associated with the $i$-th matrix $m_i$, by

\[ \delta(i) = \begin{cases} 1/l_2 & \text{for } 1 \le i \le l_2, \\ 0 & \text{for } l_2 + 1 \le i \le l_5 + 1 \end{cases} \]

(we start with a component associated with a matrix of type (1) or (2)),

\[ P_{l_5+1} = \{S \to S\}, \quad \delta_{l_5+1} = (1), \quad \phi_{l_5+1}(1) = (1), \quad \phi'_{l_5+1}(j) = \begin{cases} 1 & \text{for } j = l_5 + 1 \\ 0 & \text{otherwise} \end{cases} \]

(if this component has to be applied to a sentential form containing $S$, then we
have to replace $S \to S$ ad infinitum; if $S$ is not present in the sentential form,
we have to apply without changes this component again and again, i.e., if we
have to apply this component, we cannot terminate),



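\[ P_i = \{S \to x\}, \quad \delta_i = (1), \quad \phi_i(1) = (1), \quad \phi'_i(j) = \begin{cases} 1 & \text{for } j = 1 \\ 0 & \text{otherwise} \end{cases} \]

for $1 \le i \le l_1$ and $m_i = (S \to x)$ (if we start with a component corresponding
to a matrix of type (1), we generate a terminal word and stop the derivation),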



\[ P_i = \{S \to AX\}, \quad \delta_i = (1), \quad \phi_i(1) = (1), \quad \phi'_i(j) = \begin{cases} 1/n_G(X) & \text{if } m_j \text{ is an } X\text{-matrix} \\ 0 & \text{otherwise} \end{cases} \]

for $l_1 + 1 \le i \le l_2$ and $m_i = (S \to AX)$ (if we start with a component
corresponding to a matrix of type (2), we generate $AX$, which is a simulation of an
application of $m_i$, and continue with an $X$-matrix of type (3), (4) or (5) as in the
matrix grammar),

\[ P_i = \{A \to w, X \to Y\}, \quad \delta_i = (1, 0), \quad \phi_i(1) = \phi_i(2) = (0, 1), \quad \phi'_i(j) = \begin{cases} 1/n_G(Y) & \text{if } m_j \text{ is a } Y\text{-matrix} \\ 0 & \text{otherwise} \end{cases} \]

for $l_2 + 1 \le i \le l_3$ and $m_i = (A \to w, X \to Y)$ (if we apply a component
corresponding to a matrix of type (3) to a word $zX$, $z \in (N_1 \cup T)^*$, we substitute
one occurrence of $A$ in $z$ by $w$ and the only occurrence of $X$ by $Y$, thus simulating
an application of the matrix, and continue with a $Y$-matrix of type (3), (4) or
(5) as in the matrix grammar; if $A$ does not occur in $z$, then we immediately
pass without changing the sentential form to the following component which
corresponds to a matrix of type (3), (4) or (5), i.e., we have the same situation as
before the application of $P_i$),

\[ P_i = \{A \to Z, X \to Y\}, \quad \delta_i = (0, 1), \quad \phi_i(1) = \phi_i(2) = (1, 0), \quad \phi'_i(j) = \begin{cases} 1/n_G(Y) & \text{if } m_j \text{ is a } Y\text{-matrix} \\ 0 & \text{otherwise} \end{cases} \]

for $l_3 + 1 \le i \le l_4$ and $m_i = (A \to Z, X \to Y)$ (if we apply a component
corresponding to a matrix of type (4) to a word $zX$, $z \in (N_1 \cup T)^*$, we first
substitute $X$ by $Y$; if $A$ is present in $z$, we replace all occurrences of $A$ by $Z$
such that the derivation cannot be terminated since there is no rule for $Z$; if $A$
is not present in $z$, we continue with a $Y$-matrix of type (3), (4) or (5) as in
the matrix grammar; therefore in terminating derivations we have simulated a
derivation step according to $G$),

\[ P_i = \{A \to w, X \to a\}, \quad \delta_i = (1, 0), \quad \phi_i(1) = \phi_i(2) = (0, 1), \quad \phi'_i(j) = \begin{cases} 1 & \text{for } j = l_5 + 1 \\ 0 & \text{otherwise} \end{cases} \]

for $l_4 + 1 \le i \le l_5$ and $m_i = (A \to w, X \to a)$ (if we apply a component
corresponding to a matrix of type (5), we again simulate the application of the
matrix; moreover, we have to terminate since otherwise we have to continue with
the last component, which results in a non-terminating infinite derivation by the
remark added to this component).

It is easy to see that all sentential forms are terminal words or of the form
$zX$ with $z \in (N_1 \cup T)^*$. Thus, by the above explanations, any derivation of $\Gamma$
simulates a derivation of $G$. Thus $L(\Gamma) \subseteq L(G)$.

Moreover, it is easy to see that any derivation of $G$ can be simulated in $\Gamma$.
Thus $L(G) \subseteq L(\Gamma)$, too. Hence $L(\Gamma) = L(G) = L$.

We now combine our results to obtain a hierarchy with respect to the number 
of rules in the components. We shall obtain a hierarchy with two levels only. 

Theorem 1. For any $r \ge 2$,

\[ \mathcal{L}(EDT0L) = \mathcal{L}(PCD_1CF) \subset \mathcal{L}(PCD_rCF) = \mathcal{L}(RE) . \]

Proof. By [4], Lemma 4, any language generated by an Indian parallel
programmed grammar is contained in $\mathcal{L}(EDT0L)$. If we combine this result with
Lemma 3, then we obtain $\mathcal{L}(PCD_1CF) \subseteq \mathcal{L}(EDT0L)$. Together with Lemma 4
we get $\mathcal{L}(PCD_1CF) = \mathcal{L}(EDT0L)$.

By Lemma 5, Lemma 2, and the result from [1] that any probabilistic
cooperating distributed grammar system generates a recursively enumerable language,
we have

\[ \mathcal{L}(RE) \subseteq \mathcal{L}(PCD_2CF) \subseteq \mathcal{L}(PCD_rCF) \subseteq \mathcal{L}(RE) \]

for $r \ge 2$. This implies the remaining equalities $\mathcal{L}(PCD_rCF) = \mathcal{L}(RE)$ for
$r \ge 2$.
Abstract. In this note we investigate the languages obtained by
intersecting slender regular or context-free languages with the set of all
primitive words over the common alphabet. We prove that these languages
are also regular and, respectively, context-free. The statement does not
hold anymore for arbitrary regular or context-free languages. Moreover, the
set of all non-primitive words of a slender context-free language is still
context-free. Some possible directions for further research are finally
discussed.



1 Introduction 

Combinatorial properties of words and languages play an important role in math- 
ematics and theoretical computer science (algebraic coding, combinatorial theory 
of words, etc.), see, e.g., [2], [5], [8], [16]. 

A word is called primitive if it cannot be expressed as the power of another
word. It has been conjectured [1] that the set of all primitive words over a
given alphabet is not context-free. However, this language satisfies various
necessary conditions for context-free languages (see [1] for further details).
Hopefully, this conjecture requires new methods based on the structure of context-free
languages and perhaps will lead to sharper necessary conditions for languages
to be context-free.

A language is slender if the number of its words of any length is bounded by
a constant. It was proved, first in [6], and later, independently, in [13] and [10],
that slender regular and USL-languages coincide. A similar characterization of
slender context-free languages was reported in [7] and later, independently, in
[4] and [11]. It was shown that every slender context-free language is UPL and
vice versa, a statement conjectured in [10].

* This work was supported by a grant from Dirección General de Universidades,
Secretaría de Estado de Educación y Universidades, Ministerio de Educación, Cultura y
Deporte (SAB2001-0081), España.

It is known that the intersection of a regular language with the set of primitive 
words over the common alphabet is not necessarily regular. Since the set of 
all primitive words over an alphabet with at least two letters is not regular, it 
suffices to take the regular language consisting of all words over such an alphabet. 
We prove that if the regular language is slender then the above intersection is 
always regular. It immediately follows that the set of all non-primitive words 
of a slender regular language is regular too. Similar results hold for slender 
context-free languages as well. Furthermore, we prove that, similar to the case 
of regular languages, the set of all primitive words of a context-free language is 
not necessarily context-free. 

This note is organized as follows: in the next section we fix the basic notions 
and notations and recall several results which will be used in later reasonings. 
The third section is dedicated to the sets of all primitive words of slender reg- 
ular languages. The main result of this section states that these languages are 
always regular. As an immediate consequence, the set of all non-primitive words 
of a slender regular language is regular. A similar investigation is done in the
fourth section for slender context-free languages. The obtained results are similar,
namely the sets of all primitive and non-primitive words of a slender context-free
language are both context-free. The paper ends with a short section dedicated to
some open problems.

2 Preliminaries 

We give some basic notions in formal language theory; for all unexplained notions 
the reader is referred to [12]. 

A word (over $\Sigma$) is a finite sequence of elements of some finite non-empty set
$\Sigma$. We call the set $\Sigma$ an alphabet, the elements of $\Sigma$ letters. If $u$ and $v$ are words
over an alphabet $\Sigma$, then their catenation $uv$ is also a word over $\Sigma$. In particular,
for every word $u$ over $\Sigma$, $u\lambda = \lambda u = u$, where $\lambda$ denotes the empty word. Given
a word $u$, we define $u^0 = \lambda$, $u^n = u^{n-1}u$ for $n > 0$, $u^* = \{u^n : n \ge 0\}$ and
$u^+ = u^* \setminus \{\lambda\}$.

The length $|w|$ of a word $w$ is the number of letters in $w$, where each letter
is counted as many times as it occurs. Thus $|\lambda| = 0$. By the free monoid $\Sigma^*$
generated by $\Sigma$ we mean the set of all words (including the empty word $\lambda$) having
catenation as multiplication. We set $\Sigma^+ = \Sigma^* \setminus \{\lambda\}$, where the subsemigroup
$\Sigma^+$ of $\Sigma^*$ is said to be the free semigroup generated by $\Sigma$. Subsets of $\Sigma^*$ are
referred to as languages over $\Sigma$.

A primitive word over an alphabet $\Sigma$ is a nonempty word not of the form
$w^m$ for any nonempty word $w \in \Sigma^+$ and integer $m \ge 2$. The set of all primitive
words over $\Sigma$ will be denoted by $Q(\Sigma)$, or simply by $Q$ if $\Sigma$ is understood. $Q$ has
received special interest: $Q$ and $\Sigma^+ \setminus Q$ play an important role in the algebraic
theory of codes and formal languages (see [8] and [16]).
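
Primitivity is easy to test mechanically. The following Python sketch uses the standard observation that a nonempty word w is a proper power exactly when w occurs inside ww at some position strictly between 0 and |w|; the function name is an assumption of this sketch.

```python
def is_primitive(w):
    """Return True if w is nonempty and not of the form z^m with m >= 2."""
    if not w:
        return False
    # Search for w inside ww, skipping the trivial occurrence at position 0.
    # For a primitive word the first such occurrence is at position len(w).
    return (w + w).find(w, 1) == len(w)
```

For instance, is_primitive("aab") holds, while is_primitive("abab") fails because abab is the square of ab.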
We denote by $card(H)$ the cardinality of the finite set $H$. A language $L \subseteq \Sigma^*$
is said to be $k$-slender if $card(\{w \in L : |w| = n\}) \le k$, for every $n \ge 0$. A
language is slender if it is $k$-slender for some positive integer $k$. A 1-slender
language is called a thin language. A language $L \subseteq \Sigma^*$ is said to be a union of
single loops (or, in short, USL) if for some positive integer $k$ and words $u_i, w_i \in
\Sigma^*$, $v_i \in \Sigma^+$, $1 \le i \le k$,

\[ (*) \qquad L = \bigcup_{i=1}^{k} u_i v_i^* w_i . \]

A language $L \subseteq \Sigma^*$ is called a union of paired loops (or UPL, in short) if for
some positive integer $k$ and words $u_i, w_i, y_i \in \Sigma^*$, $v_i, x_i \in \Sigma^+$, $1 \le i \le k$,

\[ (**) \qquad L = \bigcup_{i=1}^{k} \{ u_i v_i^n w_i x_i^n y_i \mid n \ge 0 \} . \]

For a USL (or UPL) language $L$ the smallest $k$ such that (*) (or (**)) holds is
referred to as the USL-index (or UPL-index) of $L$. A USL language $L$ is said to
be a disjoint union of single loops (DUSL, in short) if the sets in the union (*)
are pairwise disjoint. In this case the smallest $k$ such that (*) holds and the $k$
sets are pairwise disjoint is referred to as the DUSL-index of $L$. The notions of a
disjoint union of paired loops (DUPL) and DUPL-index are defined analogously
considering the relation (**).
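
To see why a union of single loops is slender, it can help to list a finite part of such a language and count the words of each length. The Python sketch below does this for a hypothetical USL language given by two triples (u, v, w); the helper names and the example are assumptions of this sketch, and a finite sample only illustrates the bound, it does not prove it.

```python
from collections import Counter


def usl_words(triples, max_n):
    """All words u v^n w with (u, v, w) in triples and 0 <= n <= max_n."""
    return {u + v * n + w for (u, v, w) in triples for n in range(max_n + 1)}


def max_words_per_length(words):
    """Largest number of distinct words sharing one length."""
    counts = Counter(len(x) for x in words)
    return max(counts.values(), default=0)


# Hypothetical example: L = a b^* united with (ab)^* a.
triples = [("a", "b", ""), ("", "ab", "a")]
sample = usl_words(triples, max_n=5)
bound = max_words_per_length(sample)
```

Each single loop contributes at most one word per length, so a union of k single loops can never have more than k words of the same length.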

For slender regular languages, we have the following characterization, first 
proved in [6], and later, independently, in [13] and [10] ([14] and [15] are an 
extended abstract and a revised form, respectively, of [13]). 

Theorem 1. For a given language $L$, the following conditions are equivalent:
(i) $L$ is regular and slender.

(ii) $L$ is USL.

(iii) $L$ is DUSL.

Moreover, if $L$ is regular and slender, then the USL- and DUSL-indices of $L$ are
effectively computable.

The following result is taken from [10]. 

Theorem 2. Every UPL language is DUPL, slender, linear and unambiguous. 

The next characterization of slender context-free languages was proved in [7] 
and later, independently, in [4] and [11]. It was also conjectured in [10]. 

Theorem 3. Every slender context-free language is UPL. 

Any cyclic permutation of a primitive (non-primitive) word remains primitive 
(non-primitive) as formally stated in [16, 17]. 

Theorem 4. Let $i \ge 1$ and $uv \in \{p^i : p \in Q\}$. Then $vu \in \{p^i : p \in Q\}$, too. In
other words, the sets $\{p^i : p \in Q\}$ ($i \ge 1$) are closed under cyclic permutations
of words.
We shall use the following result from [18].

Theorem 5. Let $f, g \in Q$, $f \ne g$. Then $fg^n \in Q$ or $fg^{n+1} \in Q$ for all $n \ge 2$.

Let $u \ne \lambda$ and let $f$ be a primitive word with an integer $k \ge 1$ having $u = f^k$.
We write $\sqrt{u} = f$ and call $f$ the primitive root of the word $u$. The uniqueness of
the primitive root was proved in [9] (see also [16]).

Theorem 6. If $u \ne \lambda$, then there exists a unique primitive word $f$ and a unique
integer $k \ge 1$ such that $u = f^k$.
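
The primitive root and its exponent can be computed directly from the definition. The following Python sketch tries the divisors of the length in increasing order and returns the shortest prefix whose power gives the word; the function name is an assumption of this sketch.

```python
def primitive_root(u):
    """Return (f, k) with f primitive, k >= 1 and u == f * k."""
    if not u:
        raise ValueError("the empty word has no primitive root")
    n = len(u)
    for d in range(1, n + 1):
        # The shortest prefix whose power equals u is the primitive root.
        if n % d == 0 and u[:d] * (n // d) == u:
            return u[:d], n // d
```

For example, primitive_root("abab") gives ("ab", 2), and the cyclic permutation "baba" has the root "ba" with the same exponent, in line with Theorem 4.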

The next statement, useful in what follows, is also from [9]. 

Theorem 7. Let $f, g \in Q$, $f \ne g$. Then $f^m g^n \in Q$ for all $m \ge 2$, $n \ge 2$.

The following result reported in [2, 3] will also be applied in the sequel. (For 
a weaker version of this statement see also [16].) 

Theorem 8. Let $u$ and $v$ be two nonempty words, and $p, q \ge 0$ integers. If $u^p$
and $v^q$ contain a common prefix or suffix of length $|u| + |v| - \gcd(|u|, |v|)$ (where
$\gcd(|u|, |v|)$ denotes the greatest common divisor of $|u|$ and $|v|$), then $u = w^m$
and $v = w^n$, for some word $w$ and positive integers $m, n$.

Finally, we need one more result taken from [18]. 

Theorem 9. Let $p, q \in Q$, $p \ne q$. Then $card(p^+ q^+ \setminus Q) \le 1$.

3 Intersecting Slender Regular Languages with Q 

We start with some preliminary results. First it is easy to note that Theorem
9 can be extended, in a certain sense, to arbitrary words instead of primitive
ones. Assume $\Sigma = \{a, b\}$, $p = a^2$, $q = ba^2b$. Then, of course, $p, pq \notin Q$. Theorem
7 implies $pq^n \in Q$, $n \ge 2$, hence $card(pq^* \setminus Q) = 2$. In general, we have the
following result.

Lemma 1. Let $u, v \in \Sigma^+$ such that $\sqrt{u} \ne \sqrt{v}$. Then $card(uv^* \setminus Q) \le 2$.

Proof. By Theorem 9, $(\sqrt{u})^+ (\sqrt{v})^+ \setminus Q$ has at most one element. Therefore,
$uv^* \setminus Q$ has at most one element if $u \in Q$. Assume $u \in \Sigma^+ \setminus Q$ and let $u = (\sqrt{u})^s$
for some $s > 1$. Then, by Theorem 7, $uv^n \in Q$ whenever $n \ge 2$. Therefore, $uv^*$
has at most two non-primitive words. □
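
The bound of Lemma 1 can be observed on a small instance. The Python sketch below lists the non-primitive words among u v^n for 0 <= n <= max_n; the choice u = aa, v = baab and the function names are assumptions of this sketch, and checking finitely many n only illustrates the statement.

```python
def is_primitive(w):
    """True if w is nonempty and not a proper power of a shorter word."""
    return bool(w) and (w + w).find(w, 1) == len(w)


def nonprimitive_in_uv_star(u, v, max_n):
    """Non-primitive words among u v^n for 0 <= n <= max_n, in order of n."""
    candidates = (u + v * n for n in range(max_n + 1))
    return [x for x in candidates if not is_primitive(x)]


# Here the roots differ: the root of aa is a, and baab is itself primitive.
found = nonprimitive_in_uv_star("aa", "baab", max_n=6)
```

In this instance exactly two words are non-primitive, namely aa and aabaab = (aab)^2, so the bound 2 cannot be lowered in general.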

Next we prove the following statement. 

Lemma 2. Let $u, w \in \Sigma^*$ and $v \in \Sigma^+$.

(i) If $uw = \lambda$, then $uv^*w \setminus Q = \{\lambda\}$.

(ii) If $uw \ne \lambda$ and $\sqrt{wu} \ne \sqrt{v}$, then $card(uv^*w \setminus Q) \le 2$.

Proof. Using Theorem 4, it is enough to prove that $card(wuv^* \setminus Q) \le 2$ whenever
$uw, v \in \Sigma^+$ such that $\sqrt{wu} \ne \sqrt{v}$. But this is a direct consequence of Lemma 1.

□



Now we can state the main result of this section. 

Theorem 10. The family of slender regular languages is closed under intersec- 
tion with the set of all primitive words. 

Proof. Let $L$ be a slender regular language; by Theorem 1, $L$ is DUSL, hence
$L = \bigcup_{i=1}^{k} u_i v_i^* w_i$ for some positive integer $k$ and words $u_i, v_i, w_i$, $1 \le i \le k$, such
that $u_i v_i^* w_i \cap u_j v_j^* w_j = \emptyset$ for all $1 \le i \ne j \le k$. If $\sqrt{w_i u_i} = \sqrt{v_i}$ or $u_i w_i = \lambda$
for some $i$, then all words in the set $u_i v_i^+ w_i$ are non-primitive. If $\sqrt{w_i u_i} \ne \sqrt{v_i}$,
then each set $u_i v_i^* w_i$ contains at most two non-primitive words. Therefore,

\[ L \cap Q = F \cup \bigcup_{i \in I} (u_i v_i^* w_i \setminus R_i), \]

where $I = \{i : 1 \le i \le k, \sqrt{w_i u_i} \ne \sqrt{v_i}\}$, $F = \{u_i w_i : 1 \le i \le k, i \notin I, u_i w_i \in
Q\}$, and $R_i$, $i \in I$, are finite sets containing at most two words. By the closure
properties of regular languages it follows that $L \cap Q$ is regular. The slenderness
of $L \cap Q$ is obvious. □

Since the class of regular languages is closed under set difference, by Theorem 
10 we also have: 

Corollary 1. The class of slender regular languages is closed under set differ- 
ence with the language of primitive words. 

4 Intersecting Slender Context-Free Languages with Q 

Now we start a similar investigation to that from the previous section for the 
class of slender context-free languages. Again, we first need some preliminary 
results. 

Lemma 3. Let $u, w, y \in \Sigma^*$, $v, x \in \Sigma^+$. If $\{k : \sqrt{yuv^kw} = \sqrt{x}\}$ is a finite set,
then $\{uv^nwx^ny : n \ge 0\} \setminus Q$ is finite as well.

Proof. Let us first consider the case $uwy = \lambda$. Clearly, $\sqrt{v} \ne \sqrt{x}$, otherwise the
set $\{k : \sqrt{yuv^kw} = \sqrt{x}\}$ would be infinite. Then the statement follows from
Theorem 7.

Assume now that $uwy \ne \lambda$ and let $k_0$ be the maximal $k$ such that $\sqrt{yuv^kw} =
\sqrt{x}$, therefore $\sqrt{yuv^nw} \ne \sqrt{x}$ for any $n > k_0$. Let $n > \max(k_0, 3)$.

If $yuv^nw \notin Q$, then by Theorem 7 we infer that $yuv^nwx^n \in Q$, hence, by
Theorem 4, $uv^nwx^ny \in Q$.

If $yuv^nw \in Q$, then by Theorem 5 and the choice of $n$, $yuv^nwx^n \in Q$ holds.

□
Lemma 4. Let $u, w, y \in \Sigma^*$, $v, x \in \Sigma^+$ such that the set $\{k : \sqrt{yuv^kw} = \sqrt{x}\}$
is infinite. Then $\{uv^nwx^ny : n \ge 1\} \cap Q = \emptyset$.

Proof. Case 1. $uwy = \lambda$. Then, since $\{k : \sqrt{yuv^kw} = \sqrt{x}\}$ is infinite, there exist
infinitely many $k \ge 1$ with $\sqrt{v^k} = \sqrt{x}$. On the other hand, for every $k \ge 1$, we
have $\sqrt{v^k} = \sqrt{x}$ if and only if $\sqrt{v} = \sqrt{x}$. But this implies $v^kx^k \notin Q$, $k \ge 1$.

Case 2. $uwy \ne \lambda$. First we prove that $\sqrt{wyu} = \sqrt{v}$. Indeed, assume $\sqrt{wyu} \ne
\sqrt{v}$. If $wyu \notin Q$, then by Theorem 7, $wyuv^n \in Q$, $n \ge 2$. If $wyu \in Q$, then by
Theorem 5, $wyuv^n \in Q$, $n \ge 3$. Therefore, by Theorem 4, $yuv^nw \in Q$, $n \ge 3$.
But then for every $s, t \ge 3$, we obtain $\sqrt{yuv^sw} = \sqrt{yuv^tw}$ if and only if $s = t$.
Therefore, if $\sqrt{yuv^kw} = \sqrt{x}$ then $\sqrt{yuv^{k+i}w} \ne \sqrt{x}$, for any $i \ge 1$, which implies
that $\{k : \sqrt{yuv^kw} = \sqrt{x}\}$ is finite, a contradiction. Thus, we have $\sqrt{wyu} = \sqrt{v}$
(with $yuw \ne \lambda$). Furthermore, $\sqrt{yuv^sw} = \sqrt{yuv^tw}$, for all $s, t \ge 1$.

On the other hand, since $\{k : \sqrt{yuv^kw} = \sqrt{x}\}$ is infinite, there exist infinitely
many $k \ge 1$ with $\sqrt{yuv^kw} = \sqrt{x}$. Hence, using $\sqrt{yuv^sw} = \sqrt{yuv^tw}$ for all
$s, t \ge 1$, we obtain $\sqrt{yuv^kw} = \sqrt{x}$ for all $k \ge 1$. Thus, we get $\{uv^nwx^ny : n \ge 1\}
\cap Q = \emptyset$ as we stated. □

As a consequence, we have the following result similar to Theorem 10 and 
Corollary 1. Note that unlike the family of regular languages, the family of 
context-free languages is not closed under set difference. 

Theorem 11. The class of slender context-free languages is closed under inter- 
section and set difference with the language of primitive words. 

Proof. Let $L$ be a slender context-free language; by Theorems 3 and 2, $L$ is
a DUPL. Consequently, $L = \bigcup_{i=1}^{k} L_i$ with $L_i = \{u_i v_i^n w_i x_i^n y_i : n \ge 0\}$ for some
positive integer $k$ and words $u_i, v_i, w_i, x_i, y_i$, $1 \le i \le k$, such that $L_i \cap L_j = \emptyset$ for all
$1 \le i \ne j \le k$.

By Lemma 3, if $\{p : \sqrt{y_i u_i v_i^p w_i} = \sqrt{x_i}\}$ is a finite set, then $L_i$
contains a finite set of non-primitive words.

By Lemma 4, if $\{p : \sqrt{y_i u_i v_i^p w_i} = \sqrt{x_i}\}$ is an infinite set, then $L_i$
contains a primitive word only, provided that $u_i w_i y_i \in Q$, or no primitive word,
otherwise.

In conclusion

\[ L \cap Q = F \cup \bigcup_{i \in I} (L_i \setminus R_i), \]

where $I = \{i : 1 \le i \le k, \{p : \sqrt{y_i u_i v_i^p w_i} = \sqrt{x_i}\} \text{ is a finite set}\}$, $F = \{u_i w_i y_i :
1 \le i \le k, i \notin I, u_i w_i y_i \in Q\}$, and $R_i$, $i \in I$, are finite sets. By the closure
properties of context-free languages it follows that $L \cap Q$ is context-free.
Analogously,

\[ L \setminus Q = \Bigl( \bigcup_{i \in I} R_i \Bigr) \cup \Bigl( \bigcup_{i \notin I} (L_i \setminus F) \Bigr), \]

where $I$, $F$ and $R_i$ are the same sets as above. Obviously, $L \setminus Q$ is also
context-free.

In both cases, the languages are slender since they are sublanguages of a
slender language. □

This result does not hold anymore for arbitrary context-free languages.
Indeed, let us take the well-known context-free language $L = \{ww^R : w \in \{a, b\}^+\}$,
where $w^R$ is the mirror image of the word $w$. We use the pumping lemma for
showing that LDQ is not context-free. Clearly, x = a" 6a" o" 6a” lies in LnQ for 
arbitrarily large n. However, any attempt to pump two subwords of x satisfying 
the requirements of pumping lemma leads to a word which cannot be at the 
same time in L and primitive. We can state this as: 

Theorem 12. The family of context-free languages is not closed under
intersection with the language of primitive words.

5 Final Remarks 

We finish this note with a brief discussion of possible directions for further
research which appear of interest to us. There are many subclasses of regular
and context-free languages: locally-testable, poly-slender, Parikh-slender, dense, 
complete, periodic, quasi-periodic, etc. A natural continuation is to investigate 
which of these classes are closed under the intersection with the language of 
primitive words. Alternatively, in some cases, it appears attractive to study 
when the intersection of languages in these classes and the language of primitive 
words leads to a regular or context-free language. 
Abstract. We develop an algebraic theory on semiring-semimodule pairs
for ω-context-free languages. We define ω-algebraic systems and
characterize their solutions of order k by behaviors of algebraic finite automata.
These solutions are then set in correspondence to ω-context-free
languages.



1 Introduction 

The purpose of our paper is to give an algebraic approach independent of any
alphabets and languages for ω-context-free languages. The paper continues the
research of Ésik, Kuich [5-7] and uses again pairs consisting of a semiring and
a semimodule, where the semiring models a language with finite words and the
semimodule models a language with ω-words.

The paper consists of this and two more sections. We assume the reader of
this paper to be familiar with the definitions of Ésik, Kuich [5-7]. But to increase
readability, we repeat the necessary definitions concerning semiring-semimodule
pairs and quemirings in this section. In Section 2, ω-algebraic systems and
ω-algebraic power series are considered. The solutions of order k of these
ω-algebraic systems are characterized by behaviors of algebraic finite automata.
The ω-algebraic systems and ω-algebraic power series are then connected in
Section 3 to ω-context-free grammars and ω-context-free languages, respectively.

Suppose that $S$ is a semiring and $V$ is a commutative monoid written
additively. We call $V$ a (left) $S$-semimodule if $V$ is equipped with a (left) action

\[ S \times V \to V, \qquad (s, v) \mapsto sv \]

subject to the following rules:

\[ s(s'v) = (ss')v, \quad (s + s')v = sv + s'v, \quad s(v + v') = sv + sv', \]
\[ 1v = v, \quad 0v = 0, \quad s0 = 0, \]

for all $s, s' \in S$ and $v, v' \in V$. When $V$ is an $S$-semimodule, we call $(S, V)$ a
semiring-semimodule pair.

* Partially supported by Aktion Österreich-Ungarn, Wissenschafts- und
Erziehungskooperation, Projekt 530U1. Additionally, the first author was supported, in part, by
the National Foundation of Hungary for Scientific Research, grant T 35163.

Suppose that {S, V) is a semiring-semimodule pair such that S' is a starsemir- 
ing and S and V are equipped with an omega operation : S —>■ V. Then we 
call (S, V) a starsemiring-omegasemimodule pair. 

Ésik, Kuich [5] define a complete semiring-semimodule pair to be a
semiring-semimodule pair $(S, V)$ such that $S$ is a complete semiring and $V$ is a complete
monoid, and an infinite product operation is defined, mapping infinite
sequences over $S$ to $V$. Moreover, the infinite sums and products have to satisfy
certain conditions assuring that computations with these obey the usual laws.
Suppose that $(S, V)$ is complete. Then we define

\[ s^* = \sum_{i \ge 0} s^i \quad\text{and}\quad s^\omega = \prod_{i \ge 1} s \]

for all $s \in S$. This turns $(S, V)$ into a starsemiring-omegasemimodule pair.

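Following Bloom, Ésik [2] we define a matrix operation ${}^\omega : S^{n \times n} \to V^{n \times 1}$
on a starsemiring-omegasemimodule pair $(S, V)$ as follows. When $n = 0$, $M^\omega$ is
the unique element of $V^0$, and when $n = 1$, so that $M = (a)$, for some $a \in S$,
$M^\omega = (a^\omega)$. Assume now that $n > 1$ and decompose $M$ into blocks $a, b, c, d$ with
$a$ of dimension $1 \times 1$ and $d$ of dimension $(n-1) \times (n-1)$:

\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

Then

\[ M^\omega = \begin{pmatrix} (a + bd^*c)^\omega + (a + bd^*c)^* b d^\omega \\ (d + ca^*b)^\omega + (d + ca^*b)^* c a^\omega \end{pmatrix}. \]

Moreover, we define matrix operations ${}^{\omega,k} : S^{n \times n} \to V^{n \times 1}$, $0 \le k \le n$, as
follows. Assume that $M \in S^{n \times n}$ is decomposed into blocks $a, b, c, d$ with $a$ of
dimension $k \times k$ and $d$ of dimension $(n-k) \times (n-k)$:

\[ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

Then

\[ M^{\omega,k} = \begin{pmatrix} (a + bd^*c)^\omega \\ d^*c\,(a + bd^*c)^\omega \end{pmatrix}. \]

Observe that $M^{\omega,0} = 0$ and $M^{\omega,n} = M^\omega$.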

Suppose that $(S, V)$ is a semiring-semimodule pair and consider $T = S \times V$.
Define on $T$ the operations

\[ (s, u) \cdot (s', v) = (ss', u + sv), \qquad (s, u) + (s', v) = (s + s', u + v) \]

and constants $0 = (0, 0)$ and $1 = (1, 0)$. Equipped with these operations and
constants, $T$ satisfies the equations

\[ (x + y) + z = x + (y + z), \quad x + y = y + x, \quad x + 0 = x, \tag{1} \]
\[ (x \cdot y) \cdot z = x \cdot (y \cdot z), \quad x \cdot 1 = x, \quad 1 \cdot x = x, \tag{2} \]
\[ (x + y) \cdot z = (x \cdot z) + (y \cdot z), \tag{3} \]
\[ 0 \cdot x = 0. \tag{4} \]



Elgot [4] also defined the unary operation $\P$ on $T$: $(s, u)\P = (s, 0)$. Thus, $\P$ selects
the "first component" of the pair $(s, u)$, while multiplication with 0 on the right
selects the "second component", for $(s, u) \cdot 0 = (0, u)$, for all $u \in V$. The new
operation satisfies:

\[ x\P \cdot (y + z) = (x\P \cdot y) + (x\P \cdot z), \tag{5} \]
\[ x = x\P + (x \cdot 0), \quad x\P \cdot 0 = 0, \tag{6} \]
\[ (x + y)\P = x\P + y\P, \quad (x \cdot y)\P = x\P \cdot y\P. \tag{7} \]

Note that when $V$ is idempotent, also

\[ x \cdot (y + z) = x \cdot y + x \cdot z \]

holds.

Elgot [4] defined a quemiring to be an algebraic structure $T$ equipped with
the above operations $\cdot$, $+$, $\P$ and constants 0, 1 satisfying the equations (1)-(4)
and (5)-(7). It follows from the axioms that $x\P\P = x\P$, for all $x$ in a quemiring
$T$. Moreover, $x\P = x$ iff $x \cdot 0 = 0$.
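The axioms can be checked mechanically on a small concrete instance. The following is a minimal sketch, assuming (purely for illustration, not taken from the paper) that $S$ is the semiring of natural numbers and $V$ is the natural numbers viewed as a left $S$-semimodule under multiplication; the function names `add`, `mul`, `par` and `check_axioms` are hypothetical.

```python
from itertools import product

# Assumed toy pair: S = natural numbers, V = natural numbers with action s*v.
# An element of T = S x V is a tuple (s, u).
ZERO, ONE = (0, 0), (1, 0)

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (s, u) . (s', v) = (ss', u + sv)
    return (x[0] * y[0], x[1] + x[0] * y[1])

def par(x):
    # the unary operation selecting the first component: (s, u) -> (s, 0)
    return (x[0], 0)

def check_axioms(elems):
    """Check equations (1)-(7) and their two consequences on a finite sample."""
    for x, y, z in product(elems, repeat=3):
        assert add(add(x, y), z) == add(x, add(y, z))                    # (1)
        assert add(x, y) == add(y, x) and add(x, ZERO) == x              # (1)
        assert mul(mul(x, y), z) == mul(x, mul(y, z))                    # (2)
        assert mul(x, ONE) == x and mul(ONE, x) == x                     # (2)
        assert mul(add(x, y), z) == add(mul(x, z), mul(y, z))            # (3)
        assert mul(ZERO, x) == ZERO                                      # (4)
        assert mul(par(x), add(y, z)) == add(mul(par(x), y), mul(par(x), z))  # (5)
        assert x == add(par(x), mul(x, ZERO)) and mul(par(x), ZERO) == ZERO   # (6)
        assert par(add(x, y)) == add(par(x), par(y))                     # (7)
        assert par(mul(x, y)) == mul(par(x), par(y))                     # (7)
        assert par(par(x)) == par(x)
        assert (par(x) == x) == (mul(x, ZERO) == ZERO)
    return True
```

In this toy pair $V$ is not idempotent, so left distributivity is expected to fail: for $x = (1, 1)$ and $y = z = (1, 0)$ the two sides evaluate to $(2, 1)$ and $(2, 2)$.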

When $T$ is a quemiring, $S = T\P = \{x\P \mid x \in T\}$ is easily seen to be a
semiring. Moreover, $V = T0 = \{x \cdot 0 \mid x \in T\}$ contains 0 and is closed under
$+$, and, moreover, $sx \in V$ for all $s \in S$ and $x \in V$. Each $x \in T$ may be written in a
unique way as the sum of an element of $T\P$ and an element of $T0$, viz.
$x = x\P + x \cdot 0$. Sometimes, we will identify $S \times \{0\}$ with $S$ and $\{0\} \times V$ with $V$.
It is shown in Elgot [4] that $T$ is isomorphic to the quemiring $S \times V$ determined
by the semiring-semimodule pair $(S, V)$.

Suppose now that $(S, V)$ is a starsemiring-omegasemimodule pair. Then we
define on $T = S \times V$ a generalized star operation:

\[ (s, v)^{\otimes} = (s^*, s^\omega + s^* v) \]

for all $(s, v) \in T$.



2 ω-Algebraic Systems

In the sequel, $T$ is a quemiring, $Y = \{y_1, \dots, y_n\}$ is a set of (quemiring)
variables, $T\P = S$ and $T0 = V$. A product term $t$ has the form
$t(y_1, \dots, y_n) = s_0 y_{i_1} s_1 \dots s_{k-1} y_{i_k} s_k$, $k \ge 0$, where $s_j \in S - \{0\}$, $0 \le j < k$, $s_k \in S$, and $y_{i_j} \in Y$.
The elements $s_j$ are referred to as coefficients of the product term. If $k \ge 1$, we
do not write down coefficients that are equal to 1.

A sum-product term $p$ is a finite sum of product terms $t_j$, i.e.,

\[ p(y_1, \dots, y_n) = \sum_{1 \le j \le m} t_j(y_1, \dots, y_n). \]

The coefficients of all the product terms $t_j$, $1 \le j \le m$, are referred to as
the coefficients of the sum-product term $p$. Observe that each sum-product term
represents a polynomial of the polynomial quemiring over the quemiring $T$ in the
set of variables $Y$ in the sense of Lausch, Nöbauer [10], Chapter 1.4. For a subset
$S' \subseteq S$, we denote the collection of all sum-product terms with coefficients in
$S'$ by $S'(Y)$. Observe that the sum-product terms in $S'(Y)$ represent exactly the
polynomials of the subquemiring of the polynomial quemiring that is generated
by $S' \cup Y$.

We are only interested in the mappings induced by sum-product terms. These
mappings are polynomial functions on $T$ in the sense of Lausch, Nöbauer [10],
Chapter 1.6.

Each product term $t$ (resp. sum-product term $p$) with variables $y_1, \dots, y_n$
induces a mapping $\bar t$ (resp. $\bar p$) from $T^n$ into $T$. For a product term $t$ represented
as above, the mapping $\bar t$ is defined by

\[ \bar t(\tau_1, \dots, \tau_n) = s_0 \tau_{i_1} s_1 \dots s_{k-1} \tau_{i_k} s_k, \]

and for a sum-product term $p$, represented by a finite sum of product terms $t_j$
as above, the mapping $\bar p$ is defined by

\[ \bar p(\tau_1, \dots, \tau_n) = \sum_{1 \le j \le m} \bar t_j(\tau_1, \dots, \tau_n) \]

for all $(\tau_1, \dots, \tau_n) \in T^n$.

Let $(S, V)$ be a semiring-semimodule pair and let $S \times V$ be the quemiring
determined by it. Let $S' \subseteq S$. An $S'$-algebraic system (with variables $y_1, \dots, y_n$)
over the quemiring $S \times V$ is a system of equations

\[ y_i = p_i, \quad 1 \le i \le n, \]

where each $p_i$ is a sum-product term in $S'(Y)$. A solution to this $S'$-algebraic
system is given by $(\tau_1, \dots, \tau_n) \in T^n$ such that $\tau_i = \bar p_i(\tau_1, \dots, \tau_n)$, $1 \le i \le n$.

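Often it is convenient to write the $S'$-algebraic system $y_i = p_i$, $1 \le i \le n$, in
matrix notation. Defining the two column vectors

\[ y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \quad\text{and}\quad p = \begin{pmatrix} p_1 \\ \vdots \\ p_n \end{pmatrix}, \]

we can write $y_i = p_i$, $1 \le i \le n$, in the matrix notation

\[ y = p(y) \quad\text{or}\quad y = p. \]

A solution to $y = p(y)$ is now given by $\tau \in T^n$ such that $\tau = \bar p(\tau)$ with
$\bar p = (\bar p_i)_{1 \le i \le n}$.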

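Consider now a product term $t(y_1, \dots, y_n) = s_0 y_{i_1} s_1 \dots s_{k-1} y_{i_k} s_k$ and let
$\tau_i = (\sigma_i, \omega_i) \in S \times V$, $1 \le i \le n$. Then

\[ \bar t(\tau_1, \dots, \tau_n) = s_0 (\sigma_{i_1}, \omega_{i_1}) s_1 \dots s_{k-1} (\sigma_{i_k}, \omega_{i_k}) s_k \]
\[ = (s_0 \sigma_{i_1} s_1 \dots s_{k-1} \sigma_{i_k} s_k,\; s_0 \omega_{i_1} + s_0 \sigma_{i_1} s_1 \omega_{i_2} + \dots + s_0 \sigma_{i_1} s_1 \dots s_{k-2} \sigma_{i_{k-1}} s_{k-1} \omega_{i_k}). \]

By definition, for $\sigma = (\sigma_1, \dots, \sigma_n) \in S^n$,

\[ t_\sigma(z_1, \dots, z_n) = s_0 z_{i_1} + s_0 \sigma_{i_1} s_1 z_{i_2} + \dots + s_0 \sigma_{i_1} s_1 \dots \sigma_{i_{k-1}} s_{k-1} z_{i_k} \]

and, if $p(y_1, \dots, y_n) = \sum_{1 \le j \le m} t_j(y_1, \dots, y_n)$,

\[ p_\sigma(z_1, \dots, z_n) = \sum_{1 \le j \le m} (t_j)_\sigma(z_1, \dots, z_n). \]

Here $z_1, \dots, z_n$ are variables over the semimodule $V$. We now obtain

\[ \bar t(\tau_1, \dots, \tau_n) = \bar t(\sigma_1, \dots, \sigma_n) + t_\sigma(\omega_1, \dots, \omega_n) \]

and

\[ \bar p(\tau_1, \dots, \tau_n) = \bar p(\sigma_1, \dots, \sigma_n) + p_\sigma(\omega_1, \dots, \omega_n). \]

Moreover,

\[ \bar p(\tau_1, \dots, \tau_n)\P = \bar p(\sigma_1, \dots, \sigma_n) \quad\text{and}\quad \bar p(\tau_1, \dots, \tau_n) \cdot 0 = p_\sigma(\omega_1, \dots, \omega_n). \]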
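The splitting of a product term into its semiring part and its linear semimodule part can be tried out numerically. The sketch below assumes, for illustration only, the toy pair in which both $S$ and $V$ are the natural numbers and the action is multiplication; the helper names are hypothetical and variable indices are 0-based.

```python
def mul(x, y):
    # quemiring product on T = S x V: (s, u) . (s', v) = (ss', u + sv)
    return (x[0] * y[0], x[1] + x[0] * y[1])

def eval_term(coeffs, idx, taus):
    """Evaluate s0 y_{i1} s1 ... y_{ik} sk at taus, computing inside T."""
    acc = (coeffs[0], 0)
    for s, i in zip(coeffs[1:], idx):
        acc = mul(mul(acc, taus[i]), (s, 0))
    return acc

def eval_term_semiring(coeffs, idx, sigma):
    """First component: s0 sigma_{i1} s1 ... sigma_{ik} sk, computed in S."""
    acc = coeffs[0]
    for s, i in zip(coeffs[1:], idx):
        acc = acc * sigma[i] * s
    return acc

def eval_term_linear(coeffs, idx, sigma, omega):
    """Second component: s0 w_{i1} + s0 sigma_{i1} s1 w_{i2} + ..."""
    total, prefix = 0, coeffs[0]
    for s, i in zip(coeffs[1:], idx):
        total += prefix * omega[i]
        prefix = prefix * sigma[i] * s
    return total

# The term 2 y_1 3 y_2 1 evaluated at tau_1 = (2, 5), tau_2 = (3, 7).
coeffs, idx = [2, 3, 1], [0, 1]
taus = [(2, 5), (3, 7)]
sigma = [t[0] for t in taus]
omega = [t[1] for t in taus]
```

With these inputs `eval_term` agrees with the pair formed from `eval_term_semiring` and `eval_term_linear`, which is the content of the displayed decomposition.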



In the next theorem, $y$ (resp. $x$ and $z$) denotes a column vector

\[ \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \quad \left(\text{resp. } \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \text{ and } \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}\right), \]

where the $y_i$ (resp. $x_i$ and $z_i$) are variables over $S \times V$ (resp. $S$ and $V$).

In the sequel, $S'$ will always denote a subset of $S$ containing 0 and 1. The
$S'$-linear systems (over $V$) occurring in the next theorem are defined in Ésik,
Kuich [7] before Theorem 4.1. The $S'$-algebraic systems (over $S$) occurring in the
next theorem are defined in Kuich [9].



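Theorem 2.1 Let $S \times V$ be a quemiring and let $y = p(y)$ be an $S'$-algebraic
system over $S \times V$. Then $(\sigma, \omega) \in (S \times V)^n$ is a solution of $y = p(y)$ iff $\sigma$ is a
solution of the $S'$-algebraic system $x = p(x)$ over $S$ and $\omega$ is a solution of the
$\mathfrak{Alg}(S')$-linear system $z = p_\sigma(z)$ over $V$.

Proof. $\tau = (\sigma, \omega)$ is a solution $\Leftrightarrow$ $\tau = \bar p(\tau) = \bar p(\sigma) + p_\sigma(\omega)$ $\Leftrightarrow$ $\sigma = \bar p(\sigma)$ and
$\omega = p_\sigma(\omega)$. □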



The following definition is given just for the purpose of the present paper.
A semiring-semimodule pair $(S, V)$ is called continuous if $(S, V)$ is a complete
semiring-semimodule pair and $S$ is a continuous semiring. A quemiring is called
continuous if it is determined by a continuous semiring-semimodule pair.

Consider an $S'$-algebraic system $y = p(y)$ over a continuous quemiring $S \times V$.
Then the least solution of the $S'$-algebraic system $x = p(x)$ over $S$, say $\sigma$, exists.
Moreover, write the $\mathfrak{Alg}(S')$-linear system $z = p_\sigma(z)$ over $V$ in the form $z = Mz$,
where $M$ is an $n \times n$-matrix. Then, by Theorem 4.1 of Ésik, Kuich [7], $M^{\omega,k}$
for $0 \le k \le n$ is a solution of $z = p_\sigma(z)$. Hence, by Theorem 2.1, $(\sigma, M^{\omega,k})$,
$0 \le k \le n$, is a solution of $y = p(y)$. Given a $k \in \{0, 1, \dots, n\}$, we call this
solution the solution of order $k$ of $y = p(y)$. By ω-$\mathfrak{Alg}(S')$ we denote the collection
of all components of solutions of order $k$ of $S'$-algebraic systems over $S \times V$.
We now consider a continuous semiring-semimodule pair $(S\langle\langle A^*\rangle\rangle, S\langle\langle A^\omega\rangle\rangle)$,
where $S$ is a commutative (continuous) semiring and $A$ is an alphabet, and the
continuous quemiring $S\langle\langle A^*\rangle\rangle \times S\langle\langle A^\omega\rangle\rangle$.

Let $SA^* = \{sw \mid s \in S, w \in A^*\}$. Then ω-$\mathfrak{Alg}(SA^*)$ is equal to the collection
of the components of the solutions of order $k$ of $SA^*$-algebraic systems over
$S\langle\langle A^*\rangle\rangle \times S\langle\langle A^\omega\rangle\rangle$ $y_i = p_i$, $1 \le i \le n$, where $p_i$ is a polynomial in $S\langle (A \cup Y)^*\rangle$.
This is due to the commutativity of $S$: any polynomial function that is induced
by a sum-product term of $SA^*(Y)$ is also induced by a polynomial of $S\langle (A \cup Y)^*\rangle$
and vice versa. We denote ω-$\mathfrak{Alg}(SA^*)$ by $S^{\mathrm{alg}}\langle\langle A^*, A^\omega\rangle\rangle$. The $SA^*$-algebraic
systems are called ω-algebraic systems (over $S$ and $A$) and the power series in
$S^{\mathrm{alg}}\langle\langle A^*, A^\omega\rangle\rangle$ are called ω-algebraic power series (over $S$ and $A$).

Consider now a product term in $S\langle (A \cup Y)^*\rangle$

\[ t(y_1, \dots, y_n) = s w_0 y_{i_1} w_1 \dots w_{k-1} y_{i_k} w_k, \]

where $s \in S$ and $w_i \in A^*$, $0 \le i \le k$. By definition, for $x = (x_i)_{1 \le i \le n}$,

\[ t_x(x_1, \dots, x_n, z_1, \dots, z_n) = s w_0 z_{i_1} + s w_0 x_{i_1} w_1 z_{i_2} + \dots + s w_0 x_{i_1} w_1 \dots w_{k-2} x_{i_{k-1}} w_{k-1} z_{i_k}, \]

and, if $p(y_1, \dots, y_n) = \sum_{1 \le j \le m} t_j(y_1, \dots, y_n)$, then

\[ p_x(x_1, \dots, x_n, z_1, \dots, z_n) = \sum_{1 \le j \le m} (t_j)_x(x_1, \dots, x_n, z_1, \dots, z_n). \]

Here $x_1, \dots, x_n$ (resp. $z_1, \dots, z_n$) are variables over $S$ (resp. $V$). Observe that,
for $\sigma \in (S\langle\langle A^*\rangle\rangle)^n$, we obtain $p_x(\sigma_1, \dots, \sigma_n, z_1, \dots, z_n) = p_\sigma(z_1, \dots, z_n)$.

Given an ω-algebraic system $y = p(y)$ over $S\langle\langle A^*\rangle\rangle \times S\langle\langle A^\omega\rangle\rangle$, we call $x = p(x)$,
$z = p_x(x, z)$ the mixed ω-algebraic system over $(S\langle\langle A^*\rangle\rangle, S\langle\langle A^\omega\rangle\rangle)$ induced by
$y = p(y)$.

Write $z = p_x(x, z)$ in the form $z = M(x)z$, where $M(x)$ is an $n \times n$-matrix.
Then $(\sigma, M(\sigma)^{\omega,k})$ for $0 \le k \le n$ is a solution of $x = p(x)$, $z = p_x(x, z)$. Moreover,
it is the solution of order $k$ of $y = p(y)$.



3 ω-Context-Free Grammars



A mixed ω-context-free grammar

\[ G = (n, A, P, j, k) \]

is given by

(i) an alphabet $X = \{x_1, \dots, x_n\}$ of variables for finite derivations and an
alphabet $Z = \{z_1, \dots, z_n\}$ of variables for infinite derivations, $n \ge 1$,
$X \cap Z = \emptyset$;

(ii) an alphabet $A$ of terminal symbols, $A \cap (X \cup Z) = \emptyset$;

(iii) a finite set $P$ of productions of the form $x \to \alpha$, $x \in X$, $\alpha \in (X \cup A)^*$, or
$z \to \alpha z'$, $z, z' \in Z$, $\alpha \in (X \cup A)^*$;

(iv) the startvariable $x_j$ (resp. $z_j$) for finite (resp. infinite) derivations, $1 \le j \le n$;

(v) the set of repeated variables for infinite derivations $\{z_1, \dots, z_k\}$, $0 \le k \le n$.
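A finite leftmost derivation (with respect to $G$) $\alpha \Rightarrow_L^* w$, $\alpha \in (X \cup A)^*$, $w \in A^*$, is
defined as usual. An infinite leftmost derivation (with respect to $G$) $\pi : z \Rightarrow_L^\omega w$,
$z \in Z$, $w \in A^\omega$, is defined as follows:

\[ \pi : z \Rightarrow_L \alpha_1 z_{i_1} \Rightarrow_L^* w_1 z_{i_1} \Rightarrow_L w_1 \alpha_2 z_{i_2} \Rightarrow_L^* w_1 w_2 z_{i_2} \Rightarrow_L \dots \]
\[ \Rightarrow_L^* w_1 w_2 \dots w_m z_{i_m} \Rightarrow_L w_1 w_2 \dots w_m \alpha_{m+1} z_{i_{m+1}} \Rightarrow_L^* \dots, \]

where $z \to \alpha_1 z_{i_1}, z_{i_1} \to \alpha_2 z_{i_2}, \dots, z_{i_m} \to \alpha_{m+1} z_{i_{m+1}}, \dots \in P$, $w_1, w_2, \dots, w_m, \dots \in A^*$
and $w = w_1 w_2 \dots w_m \dots$. Let $\mathrm{INV}(\pi) = \{z \in Z \mid z$ is infinitely often
rewritten in $\pi\}$. Then

\[ L(G) = \{w \in A^* \mid x_j \Rightarrow_L^* w\} \cup \{w \in A^\omega \mid \pi : z_j \Rightarrow_L^\omega w,\ \mathrm{INV}(\pi) \cap \{z_1, \dots, z_k\} \ne \emptyset\}. \]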

We now discuss the connection between mixed ω-algebraic systems over
$(S\langle\langle A^*\rangle\rangle, S\langle\langle A^\omega\rangle\rangle)$, where $S$ is $\mathbb{B}$ or $\mathbb{N}^\infty$, and mixed ω-context-free grammars.
Define, for a given mixed ω-context-free grammar $G_{j,k} = (n, A, P, j, k)$, $1 \le j \le n$,
$0 \le k \le n$, the mixed ω-algebraic system $x_i = p_i(x_1, \dots, x_n)$,
$z_i = q_i(x_1, \dots, x_n, z_1, \dots, z_n)$, $1 \le i \le n$, over $(S\langle\langle A^*\rangle\rangle, S\langle\langle A^\omega\rangle\rangle)$ by

\[ (p_i, \alpha) = 1 \text{ if } x_i \to \alpha \in P, \qquad (p_i, \alpha) = 0 \text{ otherwise}, \]
\[ (q_i, \alpha) = 1 \text{ if } z_i \to \alpha \in P, \qquad (q_i, \alpha) = 0 \text{ otherwise}. \]

Conversely, given a mixed ω-algebraic system $x_i = p_i(x_1, \dots, x_n)$,
$z_i = q_i(x_1, \dots, x_n, z_1, \dots, z_n)$, $1 \le i \le n$, define the mixed ω-context-free grammars
$G_{j,k} = (n, A, P, j, k)$, $1 \le j \le n$, $0 \le k \le n$, by $x_i \to \alpha \in P$ iff $(p_i, \alpha) \ne 0$
and $z_i \to \alpha \in P$ iff $(q_i, \alpha) \ne 0$. Whenever we speak of a mixed ω-context-free
grammar corresponding to a mixed ω-algebraic system or vice versa, then we
mean the correspondence in the sense of the above definition.

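In the next theorem we use the isomorphism between $\mathbb{B}\langle\langle A^*\rangle\rangle \times \mathbb{B}\langle\langle A^\omega\rangle\rangle$ and
$\mathfrak{P}(A^*) \times \mathfrak{P}(A^\omega)$.

Theorem 3.1 Let $G_{j,k} = (n, A, P, j, k)$, $1 \le j \le n$, $0 \le k \le n$, be a mixed
ω-context-free grammar and $x_i = p_i(x_1, \dots, x_n)$, $z_i = q_i(x_1, \dots, x_n, z_1, \dots, z_n)$,
$1 \le i \le n$, be the mixed ω-algebraic system over $(\mathbb{B}\langle\langle A^*\rangle\rangle, \mathbb{B}\langle\langle A^\omega\rangle\rangle)$ corresponding
to it. Let $(\sigma, \tau)$ be the solution of order $k$, $0 \le k \le n$, of $x_i = p_i$, $z_i = q_i$,
$1 \le i \le n$. Then $L(G_{j,k}) = \sigma_j \cup \tau_j$, $1 \le j \le n$, $0 \le k \le n$.

Proof. By Theorem 2 of Ginsburg, Rice [8], we obtain $\sigma_j = \{w \in A^* \mid x_j \Rightarrow_L^* w\}$,
$1 \le j \le n$, and by Ésik, Kuich [7] we obtain $\tau_j = \{w \in A^\omega \mid \pi : z_j \Rightarrow_L^\omega w$,
$\mathrm{INV}(\pi) \cap \{z_1, \dots, z_k\} \ne \emptyset\}$, $1 \le j \le n$, $0 \le k \le n$. □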

If our basic quemiring is $\mathbb{N}^\infty\langle\langle A^*\rangle\rangle \times \mathbb{N}^\infty\langle\langle A^\omega\rangle\rangle$ we can draw some stronger
conclusions.

Theorem 3.2 Let $G_{j,k} = (n, A, P, j, k)$, $1 \le j \le n$, $0 \le k \le n$, be a mixed
ω-context-free grammar and $x_i = p_i(x_1, \dots, x_n)$, $z_i = q_i(x_1, \dots, x_n, z_1, \dots, z_n)$,
$1 \le i \le n$, be the mixed ω-algebraic system over $(\mathbb{N}^\infty\langle\langle A^*\rangle\rangle, \mathbb{N}^\infty\langle\langle A^\omega\rangle\rangle)$
corresponding to it. Let $(\sigma, \tau)$ be the solution of order $k$, $0 \le k \le n$, of $x_i = p_i$,
$z_i = q_i$, $1 \le i \le n$. Denote by $d_j(w)$, $w \in A^*$ (resp. $w \in A^\omega$) the number
(possibly $\infty$) of distinct finite leftmost derivations (resp. infinite leftmost derivations
$\pi$ with $\mathrm{INV}(\pi) \cap \{z_1, \dots, z_k\} \ne \emptyset$) from the variable $x_j$ (resp. $z_j$), $1 \le j \le n$.
Then

\[ \sigma_j = \sum_{w \in A^*} d_j(w) w \quad\text{and}\quad \tau_j = \sum_{w \in A^\omega} d_j(w) w, \quad 1 \le j \le n. \]

Proof. By Theorem IV.1.5 of Salomaa, Soittola [11] and Ésik, Kuich [7]. □

An ω-context-free grammar (with repeated variables) $G = (\Phi, A, P, S, F)$ is a
usual context-free grammar $(\Phi, A, P, S)$ augmented by a set $F \subseteq \Phi$ of repeated
variables. (See also Cohen, Gold [3].)

An infinite leftmost derivation $\pi$ with respect to $G$, starting from some string
$\alpha$, is given by

\[ \pi : \alpha \Rightarrow_L \alpha_1 \Rightarrow_L \alpha_2 \Rightarrow_L \dots, \]

where $\alpha, \alpha_i \in (\Phi \cup A)^*$ and $\Rightarrow_L$ is defined as usual. This infinite leftmost
derivation $\pi$ can be uniquely written as

\[ \alpha = \beta_0 B_0 \gamma_0 \Rightarrow_L^* v_0 B_0 \gamma_0 \Rightarrow_L v_0 \beta_1 B_1 \gamma_1 \gamma_0 \Rightarrow_L^* v_0 v_1 B_1 \gamma_1 \gamma_0 \Rightarrow_L v_0 v_1 \beta_2 B_2 \gamma_2 \gamma_1 \gamma_0 \Rightarrow_L^* \dots, \]

where $v_i \in A^*$, $\beta_i, \gamma_i \in (\Phi \cup A)^*$, $B_i \to \beta_{i+1} B_{i+1} \gamma_{i+1} \in P$, $\beta_i \Rightarrow_L^* v_i$, the specific
occurrence of the variable $B_i$ is not rewritten in the subderivation $\beta_i B_i \gamma_i \Rightarrow_L^*
v_i B_i \gamma_i$ and the variables of $\gamma_i$ are never rewritten in the infinite leftmost
derivation $\pi$. This occurrence of the variable $B_i$ is called the $i$-th significant variable of $\pi$.
(Observe that the infinite derivation tree of $\pi$ has a unique infinite path
determining the $B_i$'s.) We write also, for this infinite leftmost derivation, $\pi : \alpha \Rightarrow_L^\omega w$ for
$w = v_0 v_1 \dots v_n \dots$. By definition, $\mathrm{INV}(\pi) = \{A \in \Phi \mid A$ is rewritten infinitely
often in $\pi\}$. The ω-language $L(G)$ generated by the ω-context-free grammar $G$ is
defined by

\[ L(G) = \{w \in A^* \mid S \Rightarrow_L^* w\} \cup \{w \in A^\omega \mid \pi : S \Rightarrow_L^\omega w,\ \mathrm{INV}(\pi) \cap F \ne \emptyset\}. \]

An ω-language $L$ is called ω-context-free if it is generated by an ω-context-free
grammar. (Usually, an ω-language is a subset of $A^\omega$. In our paper, it is a
subset of $A^* \cup A^\omega$.)

The connection between an ω-algebraic system over $S\langle\langle A^*\rangle\rangle \times S\langle\langle A^\omega\rangle\rangle$ and an
ω-context-free grammar is as usual. Define, for a given ω-context-free grammar
$G_{j,k} = (\{y_1, \dots, y_n\}, A, P, y_j, \{y_1, \dots, y_k\})$ the ω-algebraic system $y_i = p_i(y_1, \dots, y_n)$,
$1 \le i \le n$, over $S\langle\langle A^*\rangle\rangle \times S\langle\langle A^\omega\rangle\rangle$ by $(p_i, \alpha) = 1$ if $y_i \to \alpha \in P$, $(p_i, \alpha) = 0$
otherwise. Conversely, given an ω-algebraic system $y_i = p_i(y_1, \dots, y_n)$, $1 \le i \le n$,
define the ω-context-free grammars $G_{j,k} = (\{y_1, \dots, y_n\}, A, P, y_j, \{y_1, \dots, y_k\})$,
$1 \le j \le n$, $0 \le k \le n$, by $y_i \to \alpha \in P$ iff $(p_i, \alpha) \ne 0$.

Each ω-context-free grammar $G$ induces a mixed ω-context-free grammar
$G'$ as follows. Let $G = (\Phi, A, P, S, F)$, where without loss of generality,
$\Phi = \{y_1, \dots, y_n\}$, $S = y_j$, and $F = \{y_1, \dots, y_k\}$. Then $G' = (n, A, P', j, k)$, where
$P'$ is defined as follows. Let $y_i \to \alpha = w_0 y_{i_1} w_1 \dots w_{t-1} y_{i_t} w_t \in P$, where
$y_i, y_{i_1}, \dots, y_{i_t} \in \Phi$ and $w_0, w_1, \dots, w_t \in A^*$. Then we define the following set
of productions

\[ U_{y_i \to \alpha} = \{x_i \to w_0 x_{i_1} w_1 \dots w_{t-1} x_{i_t} w_t\} \cup \{z_i \to w_0 z_{i_1},\ z_i \to w_0 x_{i_1} w_1 z_{i_2},\ \dots,\ z_i \to w_0 x_{i_1} w_1 x_{i_2} \dots w_{t-1} z_{i_t}\}, \]

and, moreover,

\[ P' = \bigcup_{y_i \to \alpha \in P} U_{y_i \to \alpha}. \]
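The construction of the production set induced by a single production is mechanical, and a short sketch may help when working through examples. The encoding below is an assumption made for illustration: a variable $y_i$ is the tuple `('y', i)`, a terminal is a string, and the function name `induced_productions` is hypothetical.

```python
def induced_productions(head, body):
    """Productions of the mixed grammar induced by y_head -> body.

    body is a list of symbols; ('y', i) stands for the variable y_i and
    any str stands for a terminal word.
    Returns a list of (left-hand side, right-hand side) pairs.
    """
    def to_x(sym):
        # replace a y-variable by the x-variable with the same index
        return ('x', sym[1]) if isinstance(sym, tuple) else sym

    # the single x-production: all y's replaced by x's
    prods = [(('x', head), [to_x(s) for s in body])]
    # one z-production per variable occurrence: cut the body after that
    # occurrence, turn it into a z-variable, and replace earlier y's by x's
    for pos, sym in enumerate(body):
        if isinstance(sym, tuple):
            prefix = [to_x(s) for s in body[:pos]]
            prods.append((('z', head), prefix + [('z', sym[1])]))
    return prods
```

For instance, the production $y_2 \to y_1 y_2$ yields $x_2 \to x_1 x_2$, $z_2 \to z_1$ and $z_2 \to x_1 z_2$, while a production without variables on the right-hand side yields only an $x$-production.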

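It is clear that, for a finite leftmost derivation $y_i \Rightarrow_L^* w$, $w \in A^*$, in $G$, there
exists a finite leftmost derivation $x_i \Rightarrow_L^* w$ in $G'$ using only the $x$-productions.
Moreover, for each infinite leftmost derivation in $G$

\[ y_i \Rightarrow_L \beta_1 y_{i_1} \gamma_1 \Rightarrow_L^* w_1 y_{i_1} \gamma_1 \Rightarrow_L w_1 \beta_2 y_{i_2} \gamma_2 \gamma_1 \Rightarrow_L^* w_1 w_2 y_{i_2} \gamma_2 \gamma_1 \Rightarrow_L w_1 w_2 \beta_3 y_{i_3} \gamma_3 \gamma_2 \gamma_1 \Rightarrow_L^* \dots, \]

where $y_i$ is the 0-th, and $y_{i_j}$ is the $j$-th significant variable, there exists the
following infinite leftmost derivation in $G'$:

\[ z_i \Rightarrow_L \beta'_1 z_{i_1} \Rightarrow_L^* w_1 z_{i_1} \Rightarrow_L w_1 \beta'_2 z_{i_2} \Rightarrow_L^* w_1 w_2 z_{i_2} \Rightarrow_L w_1 w_2 \beta'_3 z_{i_3} \Rightarrow_L^* \dots, \]

where, if in $\beta_i$ the $y$'s are replaced by $x$'s, we get $\beta'_i$. Here $z_i \to \beta'_1 z_{i_1} \in
U_{y_i \to \beta_1 y_{i_1} \gamma_1}$ and $z_{i_j} \to \beta'_{j+1} z_{i_{j+1}} \in U_{y_{i_j} \to \beta_{j+1} y_{i_{j+1}} \gamma_{j+1}}$. Both infinite leftmost
derivations generate $w_1 w_2 w_3 \dots \in A^\omega$.

Vice versa, to each infinite leftmost derivation $z_i \Rightarrow_L^\omega w$ in $G'$ there exists,
in the same manner, an infinite leftmost derivation in $G$ $y_i \Rightarrow_L^\omega w$, $w \in A^\omega$.
Moreover, if $P'$ is the disjoint union of the $U_{y_i \to \alpha}$ for all $y_i \to \alpha \in P$, then the
correspondence between infinite leftmost derivations in $G$ and in $G'$ is one-to-one.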

For an infinite leftmost derivation $\pi$ in an ω-context-free grammar $G$, define
$\mathrm{INSV}(\pi) = \{y_i \in \Phi \mid y_i$ appears infinitely often as a significant variable in $\pi\}$.
Clearly, if for all infinite leftmost derivations $\pi$ of the ω-context-free grammar
$G = (\Phi, A, P, S, F)$, $\mathrm{INV}(\pi) \cap F \ne \emptyset$ iff $\mathrm{INSV}(\pi) \cap F \ne \emptyset$, then $L(G') = L(G)$,
where $G'$ is the mixed ω-context-free grammar induced by $G$.

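Theorem 3.3 Let $G_{j,k} = (\{y_1, \dots, y_n\}, A, P, y_j, \{y_1, \dots, y_k\})$, $1 \le j \le n$,
$0 \le k \le n$, be an ω-context-free grammar and $y_i = p_i(y_1, \dots, y_n)$, $1 \le i \le n$, be
the ω-algebraic system over $\mathbb{B}\langle\langle A^*\rangle\rangle \times \mathbb{B}\langle\langle A^\omega\rangle\rangle$ corresponding to it. Assume that,
for each infinite leftmost derivation $\pi$, $\mathrm{INV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$ iff
$\mathrm{INSV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$. Let $(\sigma, \tau)$ be the solution of order $k$, $0 \le k \le n$, of the
ω-algebraic system over $(\mathbb{B}\langle\langle A^*\rangle\rangle, \mathbb{B}\langle\langle A^\omega\rangle\rangle)$ induced by $y_i = p_i$, $1 \le i \le n$. Then
$L(G_{j,k}) = \sigma_j \cup \tau_j$, $1 \le j \le n$, $0 \le k \le n$.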

Theorem 3.4 Let $G_{j,k} = (\{y_1, \dots, y_n\}, A, P, y_j, \{y_1, \dots, y_k\})$, $1 \le j \le n$,
$0 \le k \le n$, be an ω-context-free grammar and $y_i = p_i(y_1, \dots, y_n)$, $1 \le i \le n$,
be the ω-algebraic system over $\mathbb{N}^\infty\langle\langle A^*\rangle\rangle \times \mathbb{N}^\infty\langle\langle A^\omega\rangle\rangle$ corresponding to it.
Assume that, for each infinite leftmost derivation $\pi$, $\mathrm{INV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$
iff $\mathrm{INSV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$. Denote by $d_j(w)$, $w \in A^*$ (resp. $w \in A^\omega$) the
number (possibly $\infty$) of distinct finite leftmost derivations (resp. infinite leftmost
derivations $\pi$ with $\mathrm{INSV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$) from the variable $y_j$, $1 \le j \le n$.
Then

\[ \sigma_j = \sum_{w \in A^*} d_j(w) w \quad\text{and}\quad \tau_j = \sum_{w \in A^\omega} d_j(w) w, \quad 1 \le j \le n. \]

Observe that, if $k = n$ or $n = 1$, then the assumption $\mathrm{INV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$
iff $\mathrm{INSV}(\pi) \cap \{y_1, \dots, y_k\} \ne \emptyset$ for all $\pi$ is satisfied.



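Example 3.1 (see also Cohen, Gold [3], Example 3.1.6). Consider the ω-algebraic
system over $\mathbb{B}\langle\langle A^*\rangle\rangle \times \mathbb{B}\langle\langle A^\omega\rangle\rangle$, where $A = \{a, b\}$: $y_1 = a y_1 b + ab$, $y_2 = y_1 y_2$. It
induces the mixed ω-algebraic system over $(\mathbb{B}\langle\langle A^*\rangle\rangle, \mathbb{B}\langle\langle A^\omega\rangle\rangle)$ $x_1 = a x_1 b + ab$,
$x_2 = x_1 x_2$, $z_1 = a z_1$, $z_2 = z_1 + x_1 z_2$. The least solution of $x_1 = a x_1 b + ab$,
$x_2 = x_1 x_2$ is given by $\sigma = (\sum_{n \ge 1} a^n b^n, 0)$. The $z$-equations can be written
in the form $z = Mz$, where

\[ M = \begin{pmatrix} a & 0 \\ \varepsilon & x_1 \end{pmatrix}. \]

We obtain

\[ M^{\omega,1} = \begin{pmatrix} a^\omega \\ x_1^* a^\omega \end{pmatrix} \quad\text{and}\quad M^{\omega,2} = \begin{pmatrix} a^\omega \\ x_1^\omega + x_1^* a^\omega \end{pmatrix}. \]

The ω-context-free grammar $G$ corresponding to the ω-algebraic system has
productions $y_1 \to a y_1 b$, $y_1 \to ab$, $y_2 \to y_1 y_2$. The infinite leftmost derivations
are

(i) $y_1 \Rightarrow_L a y_1 b \Rightarrow_L aa y_1 bb \Rightarrow_L \dots$, i.e., $y_1 \Rightarrow_L^\omega a^\omega$, with
repeated variable $y_1$;

(ii) $y_2 \Rightarrow_L y_1 y_2 \Rightarrow_L^* a^{n_1} b^{n_1} y_2 \Rightarrow_L a^{n_1} b^{n_1} y_1 y_2 \Rightarrow_L^* a^{n_1} b^{n_1} \dots a^{n_t} b^{n_t} y_2 \Rightarrow_L \dots$,
i.e., $y_2 \Rightarrow_L^\omega a^{n_1} b^{n_1} \dots a^{n_t} b^{n_t} \dots$, with repeated variables $y_1, y_2$;

(iii) $y_2 \Rightarrow_L^* a^{n_1} b^{n_1} \dots a^{n_t} b^{n_t} y_2 \Rightarrow_L a^{n_1} b^{n_1} \dots a^{n_t} b^{n_t} y_1 y_2 \Rightarrow_L^\omega \dots$,
i.e., $y_2 \Rightarrow_L^\omega a^{n_1} b^{n_1} \dots a^{n_t} b^{n_t} a^\omega$, $t \ge 0$, with repeated variable $y_1$.

If $y_1$ is the only repeated variable, and $y_1$ or $y_2$ is the start variable, then
$L(G_{1,1}) = \sum_{n \ge 1} a^n b^n + a^\omega$ or $L(G_{2,1}) = (\sum_{n \ge 1} a^n b^n)^\omega + (\sum_{n \ge 1} a^n b^n)^* a^\omega$,
respectively. If the repeated variables are $y_1$ and $y_2$, and $y_1$ or $y_2$ is the start variable,
then we obtain again $L(G_{1,2}) = \sum_{n \ge 1} a^n b^n + a^\omega$ or
$L(G_{2,2}) = (\sum_{n \ge 1} a^n b^n)^\omega \cup (\sum_{n \ge 1} a^n b^n)^* a^\omega$, respectively. Compare this with the solutions of order 1 or
2 of the ω-algebraic system $y_1 = a y_1 b + ab$, $y_2 = y_1 y_2$:

\[ \Bigl(\sum_{n \ge 1} a^n b^n,\ 0\Bigr) + \Bigl(a^\omega,\ \Bigl(\sum_{n \ge 1} a^n b^n\Bigr)^* a^\omega\Bigr) \]

or

\[ \Bigl(\sum_{n \ge 1} a^n b^n,\ 0\Bigr) + \Bigl(a^\omega,\ \Bigl(\sum_{n \ge 1} a^n b^n\Bigr)^\omega + \Bigl(\sum_{n \ge 1} a^n b^n\Bigr)^* a^\omega\Bigr), \]

respectively. If $y_1$ is the only repeated variable and $y_2$
is the start variable then $(\sum_{n \ge 1} a^n b^n)^\omega$ is missing. That is due to the fact that
in the derivations (ii) each $y_1$ derives a finite word $a^n b^n$ by a finite leftmost
subderivation $y_1 \Rightarrow_L^* a^n b^n$ and never is a significant variable.

If all variables are repeated variables that does not matter: each infinite
leftmost derivation contributes to the generated language. Hence, if the repeated
variables are $y_1, y_2$ and the start variable is $y_1$ or $y_2$, the infinite parts of the
solutions of order 1 or 2 correspond to the generated languages by Theorem 3.3.

□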

In the next example there is only one variable. Hence, we can apply
Theorems 3.3 and 3.4.

Example 3.2. Consider the ω-algebraic system $y_1 = a y_1 y_1 + b$ over $\mathbb{N}^\infty\langle\langle A^*\rangle\rangle \times
\mathbb{N}^\infty\langle\langle A^\omega\rangle\rangle$, where $A = \{a, b\}$. The least solution of the algebraic system $x_1 =
a x_1 x_1 + b$ over $\mathbb{N}^\infty\langle\langle A^*\rangle\rangle$ is given by $\sigma = D^* b$, where $D$ is the characteristic series
of the restricted Dyck language (see Berstel [1]). The mixed ω-algebraic system
over $(\mathbb{N}^\infty\langle\langle A^*\rangle\rangle, \mathbb{N}^\infty\langle\langle A^\omega\rangle\rangle)$ $x_1 = a x_1 x_1 + b$, $z_1 = a z_1 + a x_1 z_1$ has the solution of
order 1 $(D^* b, (a + a x_1)^\omega (D^* b)) = (D^* b, (a + a D^* b)^\omega) = (D^* b, (a + D)^\omega)$, since
$a D^* b = D$.

The ω-context-free grammar corresponding to $y_1 = a y_1 y_1 + b$ has productions
$y_1 \to a y_1 y_1$, $y_1 \to b$ and generates the language $D^* b + (a + D)^\omega = D^* b + (a^* D)^\omega +
(a^* D)^* a^\omega$.

Since each word in $(a^* D)^*$ and in $(a^* D)^\omega$ has a unique factorization into
words of $a^* D$, all coefficients of $D^* b + (a + D)^\omega$ are 0 or 1, i.e., the ω-context-free
grammar with productions $y_1 \to a y_1 y_1$, $y_1 \to b$ is an "unambiguous" ω-context-free
grammar. □
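The finite part of this claim can be checked experimentally for short words. The following is a minimal sketch, not part of the paper: it approximates the least solution of $x_1 = a x_1 x_1 + b$ over the natural numbers by fixed-point iteration, keeping only words up to an assumed length bound `max_len` (at least 1), and compares the number of words of each length with the Catalan numbers. All function names are hypothetical, and the infinite words are not covered.

```python
from collections import Counter
from math import comb

def concat(p, q, max_len):
    """Cauchy product of two series (word -> coefficient), truncated by length."""
    out = Counter()
    for u, cu in p.items():
        for v, cv in q.items():
            if len(u) + len(v) <= max_len:
                out[u + v] += cu * cv
    return out

def least_solution(max_len):
    """Iterate x -> a x x + b from the zero series until nothing changes."""
    x = Counter()
    while True:
        xx = concat(x, x, max_len - 1)  # leave room for the leading 'a'
        nxt = Counter({'b': 1})
        for w, c in xx.items():
            nxt['a' + w] += c
        if nxt == x:
            return x
        x = nxt

def catalan(m):
    return comb(2 * m, m) // (m + 1)

sol = least_solution(9)
# Every coefficient found is 1, so each short word has a single derivation.
unambiguous_up_to_bound = set(sol.values()) == {1}
```

The iteration stabilizes because the coefficient of a word depends only on coefficients of strictly shorter words.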

Let $(S, V)$ be a continuous starsemiring-omegasemimodule pair and inspect
the solutions of order $k$: If $(\sigma, \omega)$ is a solution of order $k$ of an $S'$-algebraic system
over $S \times V$ then $\sigma \in \mathfrak{Alg}(S')$ and $\omega$ is the $k$-th automata theoretic solution of a
finite $\mathfrak{Alg}(S')$-linear system. Hence, by Theorems 3.9, 3.10, 3.2 of Ésik, Kuich [6]
and by Theorem 4.4 of Ésik, Kuich [7], $\omega$ is of the form $\omega = \sum_{1 \le j \le m} s_j t_j^\omega$ with
$s_j, t_j \in \mathfrak{Rat}(\mathfrak{Alg}(S')) = \mathfrak{Alg}(S')$. Hence, again by Theorem 3.9 of Ésik, Kuich [6]
and by Theorem 3.10 of Ésik, Kuich [7] we obtain the following result.

Theorem 3.5 Let $(S, V)$ be a continuous starsemiring-omegasemimodule
pair. Then the following statements are equivalent for $(s, v) \in S \times V$:

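(i) $(s, v) = \|\mathfrak{A}\|$, where $\mathfrak{A}$ is a finite automaton,

(ii) $(s, v) = \|\mathfrak{A}\|_1$, where $\mathfrak{A}$ is a finite automaton,

(iii) $(s, v) \in$ ω-$\mathfrak{Alg}(S')$,

(iv) $s \in \mathfrak{Alg}(S')$ and $v = \sum_{1 \le k \le m} s_k t_k^\omega$, where $s_k, t_k \in \mathfrak{Alg}(S')$.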



Theorem 3.6 Let $(S, V)$ be a continuous starsemiring-omegasemimodule
pair. Then ω-$\mathfrak{Alg}(S')$ is an ω-rationally closed quemiring.



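Proof. Since, by assumption, $0, 1 \in S'$ we infer that $0, 1 \in$ ω-$\mathfrak{Alg}(S')$. Assume
now that $(\sigma_1, \omega_1)$ and $(\sigma_2, \omega_2)$ are in ω-$\mathfrak{Alg}(S')$. Then, by Theorem 3.5, $\sigma_1, \sigma_2 \in
\mathfrak{Alg}(S')$ and $\omega_1 = \sum_{1 \le k \le m_1} s_k^1 (t_k^1)^\omega$, $\omega_2 = \sum_{1 \le k \le m_2} s_k^2 (t_k^2)^\omega$ for some
$s_k^1, s_k^2, t_k^1, t_k^2 \in \mathfrak{Alg}(S')$. We obtain

\[ (\sigma_1, \omega_1) + (\sigma_2, \omega_2) = \Bigl(\sigma_1 + \sigma_2,\ \sum_{1 \le k \le m_1} s_k^1 (t_k^1)^\omega + \sum_{1 \le k \le m_2} s_k^2 (t_k^2)^\omega\Bigr) \]

and

\[ (\sigma_1, \omega_1) \cdot (\sigma_2, \omega_2) = \Bigl(\sigma_1 \sigma_2,\ \sum_{1 \le k \le m_1} s_k^1 (t_k^1)^\omega + \sum_{1 \le k \le m_2} \sigma_1 s_k^2 (t_k^2)^\omega\Bigr). \]

Hence, $(\sigma_1, \omega_1) + (\sigma_2, \omega_2)$ and $(\sigma_1, \omega_1) \cdot (\sigma_2, \omega_2)$ are again in ω-$\mathfrak{Alg}(S')$.
Moreover, we obtain

\[ (\sigma_1, \omega_1)\P = (\sigma_1, 0) \]

and

\[ (\sigma_1, \omega_1)^{\otimes} = \Bigl(\sigma_1^*,\ \sigma_1^\omega + \sum_{1 \le k \le m_1} \sigma_1^* s_k^1 (t_k^1)^\omega\Bigr). \]

Hence, $(\sigma_1, \omega_1)\P$ and $(\sigma_1, \omega_1)^{\otimes}$ are again in ω-$\mathfrak{Alg}(S')$ and ω-$\mathfrak{Alg}(S')$ is rationally
closed. □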

Notation 3.1.5, Definition 2.2.1 and Theorem 4.1.8(a) of Cohen, Gold [3] and 
Theorem 3.5(iv) yield the next result. 

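Theorem 3.7 $\mathrm{CFL}_\omega = \{L \subseteq A^\omega \mid L \in \mathbb{B}^{\mathrm{alg}}\langle\langle A^*, A^\omega\rangle\rangle,\ A \text{ an alphabet}\}$.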

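Let $t \in \mathbb{B}^{\mathrm{alg}}\langle\langle A^*\rangle\rangle$. Then $t$ is the $x_2$-component of the least solution of an
algebraic system $x_i = p_i(x_2, \dots, x_n)$, $2 \le i \le n$, over $\mathbb{B}^{\mathrm{alg}}\langle\langle A^*\rangle\rangle$. Consider the
ω-algebraic system over $\mathbb{B}\langle\langle A^*\rangle\rangle \times \mathbb{B}\langle\langle A^\omega\rangle\rangle$:

\[ y_1 = y_2 y_1, \quad y_i = p_i(y_2, \dots, y_n), \quad 2 \le i \le n, \]

and consider the induced mixed ω-algebraic system over $(\mathbb{B}\langle\langle A^*\rangle\rangle, \mathbb{B}\langle\langle A^\omega\rangle\rangle)$:

\[ z_1 = z_2 + x_2 z_1, \quad z_i = (p_i)_x(x_1, \dots, x_n, z_1, \dots, z_n), \quad 2 \le i \le n, \]
\[ x_1 = x_2 x_1, \quad x_i = p_i(x_2, \dots, x_n), \quad 2 \le i \le n. \]

The first component of the least solution of $x_1 = x_2 x_1$, $x_i = p_i(x_2, \dots, x_n)$,
$2 \le i \le n$, is 0. We now compute the solution of order 1 of $z_1 = z_2 + x_2 z_1$,
$z_i = (p_i)_x(x_1, \dots, x_n, z_1, \dots, z_n)$, $2 \le i \le n$. We write the system in the form
$z = Mz$ and obtain

\[ M = \begin{pmatrix} x_2 & \varepsilon\ 0\ \dots\ 0 \\ 0 & M' \end{pmatrix}. \]

Hence, the first component of $M^{\omega,1}$ is $x_2^\omega$ and the first component of the solution
of order 1 is given by $(0, t^\omega)$.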

Consider now the ω-context-free grammar $G$ corresponding to $y_1 = y_2 y_1$,
$y_i = p_i$, $2 \le i \le n$, with the set of repeated variables $\{y_1\}$ and start variable $y_1$.
The only infinite leftmost derivations $\pi$, where $y_1$ appears infinitely often, are of
the form

\[ \pi : y_1 \Rightarrow_L y_2 y_1 \Rightarrow_L^* w_1 y_1 \Rightarrow_L w_1 y_2 y_1 \Rightarrow_L^* w_1 w_2 y_1 \Rightarrow_L \dots. \]

The only significant variable of such a derivation $\pi$ is $y_1$, i.e., $\mathrm{INSV}(\pi) = \{y_1\}$,
and $\mathrm{INSV}(\pi) \cap \{y_1\} \ne \emptyset$ iff $\mathrm{INV}(\pi) \cap \{y_1\} \ne \emptyset$. Hence, $L(G_{1,1}) = t^\omega$ by
Theorem 3.3.
The usual constructions yield then, for $s + v$, where $v = \sum_{1 \le k \le m} s_k t_k^\omega$,
$s, s_k, t_k \in \mathbb{B}^{\mathrm{alg}}\langle\langle A^*\rangle\rangle$, an ω-context-free grammar $G'$ such that $L(G') = s + v$.

Hence, we have given a construction proving again Theorem 3.7. But
additionally, $G'$ has the nice property that for each infinite leftmost derivation $\pi$,
we obtain $\mathrm{INSV}(\pi) \cap F \ne \emptyset$ iff $\mathrm{INV}(\pi) \cap F \ne \emptyset$, where $F$ is the set of repeated
variables of $G'$.
Abstract. It is well known that the family of regular languages (over
alphabet $A$), accepted by finite automata, coincides with the set of
supports of the rational and recognizable formal power series over $\mathbb{N}$ with the
set of variables $A$. Here we prove that there is a corresponding
presentation for languages accepted by integer weighted finite automata, where
the weights are from the additive group of integers, via the matrices over
Laurent polynomials with integer coefficients.



1 Introduction 

It is well known that the family of languages accepted by finite automata (over
alphabet $A$) can also be defined by the set of recognizable formal power series
over $\mathbb{N}$, which in turn is equal to the set of rational formal power
series over $\mathbb{N}$, where $A$ is considered as a noncommutative set of variables. This
connection is proved by using the matrix representation of finite automata.

Here we give a similar representation for the family of languages accepted
by the integer weighted finite automata, see [4,5]. In these automata the
weights are from the additive group of integers and a word is accepted if it
has a successful path in the underlying automaton and the weight of the path
adds up to zero. We show that there is a connection between these languages
and the recognizable and rational formal power series with coefficients from the
ring of the Laurent polynomials with integer coefficients. The proof uses the
representation of the integer weighted finite automata with matrices over the
Laurent polynomials. The difference between these two constructions is in the
definition of the language defined by the series.

Next we give the basic definitions on words and languages. Let $A$ be a finite
set of symbols, called an alphabet. A word over $A$ is a finite sequence of symbols
in $A$. We denote by $A^*$ the set of all words over $A$. Note that also the empty
word, denoted by $\varepsilon$, is in $A^*$.

Let $u = u_1 \dots u_n$ and $v = v_1 \dots v_m$ be two words in $A^*$, where each $u_i$ and
$v_j$ are in $A$ for $1 \le i \le n$ and $1 \le j \le m$. The concatenation of $u$ and $v$
is the word $u \cdot v = uv = u_1 \dots u_n v_1 \dots v_m$. The operation of concatenation is
associative on $A^*$, and thus $A^*$ is a semigroup (containing an identity element
$\varepsilon$). Let $A^+ = A^* \setminus \{\varepsilon\}$ be the semigroup of all nonempty words over $A$. A subset
$L$ of $A^*$ is called a language.

2 Formal Power Series 

Here we give the needed definitions and notations on formal power series. As a
general reference and for the details, we refer to [2, 9, 11].

Let $K$ be a semiring and $A$ an alphabet. A formal power series $S$ is a function

\[ S : A^* \to K. \]

Note that here $A$ is considered as a (noncommutative) set of variables. The image
of a word $w$ under $S$ is denoted by $(S, w)$ and it is called the coefficient of $w$ in
$S$. The support of $S$ is the language

\[ \mathrm{supp}(S) = \{w \in A^* \mid (S, w) \ne 0\}. \]

The set of formal series over $A$ with coefficients in $K$ is denoted by $K\langle\langle A\rangle\rangle$. A
formal series with a finite support is called a polynomial. The set of polynomials
is denoted by $K\langle A\rangle$.

Let $S$ and $T$ be two formal series in $K\langle\langle A\rangle\rangle$. Then their sum is given by

\[ (S + T, w) = (S, w) + (T, w) \]

and their product by

\[ (ST, w) = \sum_{uv = w} (S, u)(T, v). \]

We also define two external operations of $K$ in $K\langle\langle A\rangle\rangle$. Assume that $a$ is in
$K$ and $S$ in $K\langle\langle A\rangle\rangle$, then the series $aS$ and $Sa$ are defined by

\[ (aS, w) = a(S, w) \quad\text{and}\quad (Sa, w) = (S, w)a. \]

A formal series $S$ can also be written in the sum form $S = \sum a_w w$ over all
$w \in A^*$ such that $a_w$ is the coefficient of $w$ in $K$, i.e. $(S, w) = a_w$.

A formal series $S$ in $K\langle\langle A\rangle\rangle$ is called proper if the coefficient of the empty
word vanishes, that is $(S, \varepsilon) = 0$. Let $S$ be a proper formal series. Then the family
$(S^n)_{n \ge 0}$ is locally finite (see [2]), and we can define the sum of this family,
denoted by

\[ S^* = \sum_{n \ge 0} S^n \]

and it is called the star of $S$. Note that $S^0 = 1$, $S^1 = S$ and $S^n = S S^{n-1}$, where
1 is the identity of $K$ under product.
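These operations are easy to experiment with once series are truncated to words of bounded length. The following is a minimal sketch, assuming $K = \mathbb{N}$ and representing a series as a mapping from words (Python strings) to coefficients; the function names and the bound `max_len` are hypothetical and not part of the text.

```python
from collections import Counter

def support(S):
    """supp(S): the words with nonzero coefficient."""
    return {w for w, c in S.items() if c != 0}

def series_sum(S, T):
    # (S + T, w) = (S, w) + (T, w)
    out = Counter(S)
    out.update(T)
    return out

def series_product(S, T, max_len):
    # (ST, w) = sum over all factorizations w = uv of (S, u)(T, v),
    # keeping only words of length at most max_len
    out = Counter()
    for u, a in S.items():
        for v, b in T.items():
            if len(u) + len(v) <= max_len:
                out[u + v] += a * b
    return out

def series_star(S, max_len):
    """Truncated star of a proper series: the sum of S^n for n >= 0."""
    assert S.get('', 0) == 0, "the star is only defined here for proper series"
    result = Counter({'': 1})  # S^0 = 1
    power = Counter({'': 1})
    # S proper implies every word in S^n has length >= n, so powers
    # beyond max_len contribute nothing to the truncation.
    for _ in range(max_len):
        power = series_product(S, power, max_len)  # S^n = S S^(n-1)
        result.update(power)
    return result
```

For the series $a + b$ the truncated star assigns coefficient 1 to every word up to the bound, and for the series $2a$ the coefficient of $aaa$ in the star is 8.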

The rational operations in $K\langle\langle A\rangle\rangle$ are the sum, the product and the star. A
formal series is called rational if it is an element of the rational closure of $K\langle A\rangle$,
i.e. it can be defined using the polynomials $K\langle A\rangle$ and the rational operations.
The family of rational series is denoted by $K^{\mathrm{rat}}\langle\langle A\rangle\rangle$.

As usual, we denote by $K^{m \times n}$ the set of the $m \times n$ matrices over $K$.

A formal series $S \in K\langle\langle A\rangle\rangle$ is called recognizable if there exists an integer
$n \ge 1$, and a monoid morphism $\mu : A^* \to K^{n \times n}$ into the multiplicative structure
of $K^{n \times n}$, and two vectors $\iota, \rho \in K^n$ such that for all words $w$,

\[ (S, w) = \iota \mu(w) \rho^T. \]

The triple $(\iota, \mu, \rho)$ is called a linear representation of $S$ with dimension $n$. The
set of recognizable series over $K$ is denoted by $K^{\mathrm{rec}}\langle\langle A\rangle\rangle$.

The next theorem is fundamental in the theory of rational series. It was first
proved by Kleene in 1956 for languages, that is, for those series with coefficients
in the Boolean semiring. It was later extended by Schützenberger to arbitrary
semirings. For details, see [2, 9, 11].

Theorem 1. A formal series is recognizable if and only if it is rational.

3 Finite Automaton 

A (nondeterministic) finite automaton is a quintuple $\mathcal{A} = (Q, A, \delta, q_{\mathcal{A}}, F)$, where
$Q$ is a finite set of states, $A$ is a finite input alphabet, $\delta : Q \times A \to 2^Q$ is a
transition function, $q_{\mathcal{A}} \in Q$ is an initial state and $F$ is the set of final states. A
transition $p \in \delta(q, a)$, where $p, q \in Q$ and $a \in A$, will also be written as $(q, a, p)$,
in which case $\delta \subseteq Q \times A \times Q$ is regarded as a relation (and sometimes also as
an alphabet). Without loss of generality, we can assume that

\[ Q = \{1, 2, \dots, n\} \text{ for some } n \ge 1, \text{ and } q_{\mathcal{A}} = 1. \]

Indeed, renaming of the states will not change the accepted language.

A path $\pi$ of $\mathcal{A}$ (from $q_1$ to $q_{k+1}$) is a sequence

\[ \pi = t_1 t_2 \dots t_k \quad\text{where } t_i = (q_i, a_i, q_{i+1}) \in \delta \tag{1} \]

for $i = 1, 2, \dots, k$. If we consider $\delta$ as an alphabet, then we can write $\pi \in \delta^*$.
The label of the path $\pi$ in (1) is the word $\|\pi\| = a_1 a_2 \dots a_k$. Let

\[ \mathcal{A}(w : p \to q) = \{\pi \mid \pi \text{ a path from } p \text{ to } q \text{ with } \|\pi\| = w\}. \]

Moreover, a path $\pi \in \mathcal{A}(w : p \to q)$ is successful (for $w$), if $p = 1$ and $q \in F$.
The language accepted by $\mathcal{A}$ is the subset $L(\mathcal{A}) \subseteq A^*$ consisting of the labels of
the successful paths of $\mathcal{A}$:

\[ L(\mathcal{A}) = \{w \in A^* \mid \pi \in \mathcal{A}(w : 1 \to q) \text{ for some } q \in F\}. \]

It is well known that each finite automaton has a matrix representation obtained
as follows. Let $\mathcal{A} = (Q, A, \delta, 1, F)$ be a finite automaton with $n$
states, i.e., $Q = \{1, 2, \dots, n\}$. Define for all $a \in A$ the matrix $M_a \in \mathbb{N}^{n\times n}$ by

$(M_a)_{ij} = 1$ if $j \in \delta(i,a)$, and $(M_a)_{ij} = 0$ otherwise. (2)

We define a monoid morphism $\mu\colon A^* \to \mathbb{N}^{n\times n}$ by setting $\mu(a) = M_a$, where the
operation in $\mathbb{N}^{n\times n}$ is the usual matrix multiplication.

Let $\iota = (1, 0, \dots, 0)$, where only the first term is nonzero, and let $\rho =
(\rho_1, \rho_2, \dots, \rho_n)$ in $\mathbb{N}^n$ where

$\rho_i = 1$ if $i \in F$, and $\rho_i = 0$ otherwise. (3)

The triple $(\iota, \mu, \rho)$ is then called the linear representation of $\mathcal{A}$.

For the proof of the following theorem, see [2, 9, 11].

Theorem 2. A language $L$ is accepted by a finite automaton if and only if there
exists a linear representation $(\iota, \mu, \rho)$ such that

$w \in L \iff \iota\mu(w)\rho^T \neq 0$.

Note that we could have defined the matrices over the Boolean semiring $\mathbb{B}$
instead of the semiring $\mathbb{N}$, replacing $\iota\mu(w)\rho^T \neq 0$ by $\iota\mu(w)\rho^T = 1$. But
using the semiring $\mathbb{N}$, we achieve the following advantage.

Theorem 3. For a finite automaton $\mathcal{A}$ having a linear representation $(\iota, \mu, \rho)$,
the value $\iota\mu(w)\rho^T$ equals the number of different successful paths in $\mathcal{A}$ for $w$.
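
The statement of Theorem 3 can be checked on small cases with a short Python sketch. The function name, the encoding of transitions as triples and the two-state example automaton are illustrative assumptions, not taken from the paper; the code only follows the construction of $M_a$, $\iota$ and $\rho$ given above.

```python
def count_successful_paths(n, delta, final, word):
    """Return iota * mu(word) * rho^T over the semiring N.

    n      -- number of states; states are 1..n and state 1 is initial
    delta  -- set of transitions (q, a, p), meaning p is in delta(q, a)
    final  -- set of final states
    """
    # Build M_a as in (2): (M_a)[i][j] = 1 iff j is in delta(i, a).
    matrices = {}
    for (q, a, p) in delta:
        M = matrices.setdefault(a, [[0] * n for _ in range(n)])
        M[q - 1][p - 1] = 1

    # Start from the row vector iota = (1, 0, ..., 0) and multiply it by
    # M_a for each letter; by associativity this is iota * mu(word).
    row = [1] + [0] * (n - 1)
    for a in word:
        M = matrices.get(a)
        if M is None:          # a letter without transitions has a zero matrix
            return 0
        row = [sum(row[i] * M[i][j] for i in range(n)) for j in range(n)]

    # Multiplying by rho^T sums the entries at the final states.
    return sum(row[q - 1] for q in final)


# Hypothetical automaton over {a}: 1 -a-> 1, 1 -a-> 2, 2 -a-> 2, F = {2}.
example_delta = {(1, "a", 1), (1, "a", 2), (2, "a", 2)}
print(count_successful_paths(2, example_delta, {2}, "aa"))   # 2
print(count_successful_paths(2, example_delta, {2}, "aaa"))  # 3
```

For the word aa the two successful paths are 1, 1, 2 and 1, 2, 2, which agrees with the computed value.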

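By Theorem 2 and the fundamental theorem, Theorem 1, we get the following
corollary.

Corollary 1. $L \subseteq A^*$ is a regular language if and only if there exists a formal
series $S_L \in \mathbb{N}^{\mathrm{rat}}\langle\langle A\rangle\rangle = \mathbb{N}^{\mathrm{rec}}\langle\langle A\rangle\rangle$ such that $L = \mathrm{supp}(S_L)$.

Note that it follows that the regular languages are closed under the rational
operations, since $\mathbb{N}^{\mathrm{rat}}\langle\langle A\rangle\rangle$ is.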

4 Laurent Polynomials and Weighted Automata 

In this section we give a corresponding representation for the languages accepted 
by the integer weighted finite automata. We begin with some definitions. 

A Laurent polynomial $p \in \mathbb{Z}[x, x^{-1}]$ with coefficients in $\mathbb{Z}$ is a series

$p(x) = \dots + a_{-2}x^{-2} + a_{-1}x^{-1} + a_0 + a_1 x + a_2 x^2 + \dots$,

where there are only finitely many nonzero coefficients $a_i \in \mathbb{Z}$. The constant
term of the Laurent polynomial $p \in \mathbb{Z}[x, x^{-1}]$ is $a_0$. The family of Laurent
polynomials with coefficients in $\mathbb{Z}$ forms a ring with respect to the operations
of sum and multiplication, which are defined in the usual way. Indeed, the sum
is defined componentwise and the multiplication is the Cauchy product of the
polynomials:

$\left(\sum_{i=-\infty}^{\infty} a_i x^i\right)\left(\sum_{i=-\infty}^{\infty} b_i x^i\right) = \sum_{i=-\infty}^{\infty}\left(\sum_{j+k=i} a_j b_k\right) x^i$.
Note that in the definition of Laurent polynomials we could also have used an
arbitrary ring instead of $\mathbb{Z}$, but here we need only the integer case. Actually, we
concentrate on matrices over Laurent polynomials with integer coefficients, that
is, the elements of $\mathbb{Z}[x, x^{-1}]^{n\times n}$ for $n \ge 1$. A Laurent polynomial matrix

$M = (c_{ij})_{n\times n}$

is an $n \times n$ square matrix the entries of which are Laurent polynomials from
$\mathbb{Z}[x, x^{-1}]$. For these matrices, multiplication is defined in the usual way using the
multiplication of the ring $\mathbb{Z}[x, x^{-1}]$. Indeed, if $M_1 = (c_{ij})_{n\times n}$ and $M_2 = (d_{ij})_{n\times n}$,
then

$M_1 \cdot M_2 = (e_{ij})_{n\times n}$,

where

$e_{ij} = \sum_{k=1}^{n} c_{ik} d_{kj} \in \mathbb{Z}[x, x^{-1}]$.

Also the sum for these matrices can be defined, but we are interested in the 
semigroups generated by a finite number of Laurent polynomials under multi- 
plication. 

Next we consider a generalization of finite automata where the transitions 
have integer weights. The type of automata we consider is closely related to the 
1-turn counter automata as considered by Baker and Book [1], Greibach [3], 
and especially by Ibarra [8]. Also, regular valence grammars are related to these 
automata, see [7]. Moreover, the extended finite automata of Mitrana and Stiebe 
[10] are generalizations of these automata. 

Consider the additive group $\mathbb{Z}$ of integers. A ($\mathbb{Z}$-)weighted finite automaton
$\mathcal{A}^\gamma$ consists of a finite automaton $\mathcal{A} = (Q, A, \delta, 1, F)$ as above, except that here
$\delta$ may be a finite multiset of transitions in $Q \times A \times Q$, and a weight function
$\gamma\colon \delta \to \mathbb{Z}$. We let $\delta$ be a multiset in order to be able to define (finitely) many
different weights for each transition of $\mathcal{A}$. For example, it is possible that for
$t_1, t_2 \in \delta$, $t_1 = (i, a, j) = t_2$ and $\gamma(t_1) \neq \gamma(t_2)$.

Let $\pi = t_1 t_2 \dots t_k$ be a path of $\mathcal{A}$, where $t_i = (q_i, a_i, q_{i+1})$ for $i = 1, 2, \dots, k$.
The weight of $\pi$ is the element

$\gamma(\pi) = \gamma(t_1) + \gamma(t_2) + \dots + \gamma(t_k)$.

Furthermore, we let

$L(\mathcal{A}^\gamma) = \{w \in A^* \mid \gamma(\pi) = 0,\ \pi \in \mathcal{A}(w : 1 \to q) \text{ for some } q \in F\}$

be the language of $\mathcal{A}^\gamma$. In other words, a word is accepted by $\mathcal{A}^\gamma$ if and only if
there is a successful path of weight 0 in $\mathcal{A}^\gamma$.

Next we shall introduce a matrix representation of integer weighted finite
automata with the matrices over the Laurent polynomials $\mathbb{Z}[x, x^{-1}]$.

Let $\mathcal{A}^\gamma$ be a weighted finite automaton, where $\mathcal{A} = (Q, A, \delta, 1, F)$ and $\gamma\colon \delta \to
\mathbb{Z}$. Let again $Q = \{1, 2, \dots, n\}$. Define for each element $a \in A$ and a pair of states
$i, j \in Q$ the Laurent polynomial

$p^a_{ij} = \sum_{t = (i,a,j) \in \delta} x^{\gamma(t)}$.

Moreover, define the Laurent polynomial matrix $M_a \in \mathbb{Z}[x, x^{-1}]^{n\times n}$ for all $a \in A$
by

$(M_a)_{ij} = p^a_{ij}$. (4)

Let $\mu\colon A^* \to \mathbb{Z}[x, x^{-1}]^{n\times n}$ be the morphism defined by $\mu(a) = M_a$. Let $\iota$ and $\rho$
be the vectors as in (3). The triple $(\iota, \mu, \rho)$ is called a Laurent representation of
$\mathcal{A}^\gamma$. For completeness' sake, we give here the proof of the following result of [6].

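Lemma 1. Let $(\iota, \mu, \rho)$ be a Laurent representation of $\mathcal{A}^\gamma$, and let $w \in A^*$. Then
the coefficient of $x^z$ in $\mu(w)_{ij}$ is equal to the number of paths $\pi \in \mathcal{A}(w : i \to j)$
of weight $z$.

Proof. We write $M_w = \mu(w)$ for each word $w$. We prove the claim by induction
on the length of the words. The claim is trivial if $w \in A$. Assume then that
the claim holds for the words $u, v \in A^+$, and let $(M_u)_{ij} = p^u_{ij} = \sum_z a^u_{ij,z} x^z$,
where $a^u_{ij,z}$ is the number of paths from $\mathcal{A}(u : i \to j)$ of weight $z$. Similarly, let
$(M_v)_{ij} = p^v_{ij} = \sum_z a^v_{ij,z} x^z$, where $a^v_{ij,z}$ is the number of paths from $\mathcal{A}(v : i \to j)$
of weight $z$. Now,

$(M_{uv})_{ij} = \sum_{k=1}^{n} p^u_{ik}\, p^v_{kj} = \sum_{k=1}^{n} \left(\sum_{z_1} a^u_{ik,z_1} x^{z_1}\right)\left(\sum_{z_2} a^v_{kj,z_2} x^{z_2}\right) = \sum_{k=1}^{n} \sum_{z_1, z_2} a^u_{ik,z_1} a^v_{kj,z_2}\, x^{z_1+z_2}$.

In other words, the coefficient of $x^z$ is equal to $\sum_{z_1+z_2=z} \sum_{k=1}^{n} a^u_{ik,z_1} a^v_{kj,z_2}$,
wherefrom the claim easily follows.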

The following result is an immediate corollary to Lemma 1. 

Theorem 4. Let $(\iota, \mu, \rho)$ be a Laurent representation of $\mathcal{A}^\gamma$, and let $w \in A^*$.
Then the constant term $c$ of $\iota\mu(w)\rho^T$ equals the number of different successful
paths of $w$ of weight 0 in $\mathcal{A}^\gamma$. In particular, $w \in L(\mathcal{A}^\gamma)$ if and only if $c > 0$.
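
As an illustration of Lemma 1 and Theorem 4, the following Python sketch computes $\iota\mu(w)\rho^T$ for a weighted automaton, storing each Laurent polynomial as a dictionary from exponents to coefficients. The function name, the encoding of weighted transitions as 4-tuples and both example automata are assumptions made for this sketch only.

```python
from collections import defaultdict


def laurent_value(n, transitions, final, word):
    """Return iota * mu(word) * rho^T as a dict {exponent: coefficient}.

    transitions -- list (a multiset) of tuples (q, a, p, weight) with
                   states 1..n, initial state 1 and integer weights
    final       -- set of final states
    """
    # Row vector of Laurent polynomials, initially iota = (1, 0, ..., 0).
    row = [{0: 1}] + [{} for _ in range(n - 1)]
    for a in word:
        new_row = [defaultdict(int) for _ in range(n)]
        # Multiply by M_a: each transition t = (q, a, p) contributes the
        # monomial x^gamma(t) to the entry (q, p), see (4).
        for (q, b, p, weight) in transitions:
            if b == a:
                for exponent, coeff in row[q - 1].items():
                    new_row[p - 1][exponent + weight] += coeff
        row = [dict(poly) for poly in new_row]

    # Multiplying by rho^T adds up the polynomials at the final states.
    total = defaultdict(int)
    for q in final:
        for exponent, coeff in row[q - 1].items():
            total[exponent] += coeff
    return dict(total)


# Hypothetical automaton for { a^n b^n : n >= 1 }: a has weight +1, b has -1.
anbn = [(1, "a", 1, 1), (1, "b", 2, -1), (2, "b", 2, -1)]
print(laurent_value(2, anbn, {2}, "aabb").get(0, 0))  # 1, word accepted
print(laurent_value(2, anbn, {2}, "aab").get(0, 0))   # 0, word rejected

# A multiset example: two copies of the same transition with different weights.
twin = [(1, "a", 1, 1), (1, "a", 1, -1)]
print(laurent_value(1, twin, {1}, "aa"))  # x^2 + 2 + x^-2 as a dict
```

In the last example the constant term 2 counts the two paths of weight 0, namely the weight sequences (+1, -1) and (-1, +1).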

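Since $\mathbb{Z}[x, x^{-1}]$ is a ring we can also study the formal power series
$\mathbb{Z}[x, x^{-1}]\langle\langle A\rangle\rangle$. Note that the zero element of $\mathbb{Z}[x, x^{-1}]$ is the zero polynomial,
where all the coefficients are 0. By Theorem 1, we get the following corollary.

Corollary 2. A language $L \subseteq A^*$ is accepted by a weighted automaton $\mathcal{A}^\gamma$ if and only if there exists a
formal power series $S_L \in \mathbb{Z}[x, x^{-1}]^{\mathrm{rat}}\langle\langle A\rangle\rangle = \mathbb{Z}[x, x^{-1}]^{\mathrm{rec}}\langle\langle A\rangle\rangle$ such that

$w \in L \iff (S_L, w) = \sum_{i=m}^{n} a_i x^i$, where $a_0 \neq 0$.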
Note that this corollary does not give any closure properties for the family
of languages accepted by integer weighted finite automata. The closure
properties of these languages were studied in [4]. For example, the family is not
closed under star.

Note also that the undecidability result in [5] yields an undecidability result for
matrices over Laurent polynomials, see [6].

Actually, for the power series $S_L \in \mathbb{Z}[x, x^{-1}]\langle\langle A\rangle\rangle$ in Corollary 2, $\mathrm{supp}(S_L) =
L(\mathcal{A})$, i.e., the support of $S_L$ is the regular language accepted by the underlying
automaton of $\mathcal{A}^\gamma$. By reordering the terms according to powers of the variable
$x$, we get

$S_L = \sum_{z=m}^{n} L_z x^z$, (5)

where $L_z$ is the sum of words of the language

$\{w \in A^* \mid \pi \in \mathcal{A}(w : 1 \to q) \text{ for some } q \in F,\ \gamma(\pi) = z\} \subseteq A^*$,

with multiplicities from $\mathbb{Z}$. We denote these languages simply by $L_z$. Now $L_0 =
L(\mathcal{A}^\gamma) = L$ and $L(\mathcal{A}) = \bigcup_{z \in \mathbb{Z}} L_z$. Note that this union can be infinite, since the
sum (5) can be infinite, even in both directions. It follows also by the rationality
of $S_L$ that the languages $L_z$ are in the family of languages accepted by integer
weighted finite automata, since $x^{-z} S_L \in \mathbb{Z}[x, x^{-1}]^{\mathrm{rat}}\langle\langle A\rangle\rangle$.
Abstract. Two models for gene assembly in ciliates have been proposed
and investigated in the last few years. The DNA manipulations postu- 
lated in the two models are very different: one model is intramolecular 
- a single DNA molecule is involved here, folding on itself according to 
various patterns, while the other is intermolecular - two DNA molecules 
may be involved here, hybridizing with each other. Consequently, the 
assembly strategies predicted by the two models are completely differ- 
ent. Interestingly however, the final result of the assembly (including the 
assembled gene) is always the same. We compare in this paper the two 
models for gene assembly, formalizing both in terms of pointer reductions. 
We also discuss invariants and universality results for both models. 



1 Introduction 

Ciliates are unicellular eukaryotic organisms, see, e.g. [23]. This is an ancient 
group of organisms, estimated to have originated around two billion years ago. 
It is also a very diverse group - some 8000 species are currently known and 
many others are likely to exist. Their diversity can be appreciated by comparing 
their genomic sequences: some ciliate types differ genetically more than humans 
differ from fruit flies! Two characteristics unify ciliates as a single group: the 
possession of hairlike cilia used for motility and food capture, and the presence 
of two kinds of functionally different nuclei in the same cell, a micronucleus and 
a macronucleus, see [15], [24], [25]; the latter feature is unique to ciliates. The 
macronucleus is the “household” nucleus - all RNA transcripts are produced 
in the macronucleus. The micronucleus is a germline nucleus and has no known 
function in the growth or in the division of the cell. The micronucleus is activated 
only in the process of sexual reproduction, where at some stage the micronuclear 
genome gets transformed into the macronuclear genome, while the old macronuclear
genome is destroyed. This process is called gene assembly; it is the most
involved DNA processing known in living organisms, and it is most spectacular 
in the Stichotrichs species of ciliates (which we consider in this paper). What 
makes this process so complex is the unusual rearrangements that ciliates have 
engineered in the structure of their micronuclear genome. While genes in the 
macronucleus are contiguous sequences of DNA placed (mostly) on their own 
molecules (and some of them are the shortest DNA molecules known in Nature),
the genes in the micronucleus are placed on long chromosomes and they
are broken into pieces called MDSs, separated by noncoding blocks called IESs,
see [15,21-26]. Adding to the complexity, the order of the MDSs is permuted
and MDSs may be inverted. One of the amazing features of this process is that 
ciliates appear to use “linked lists” in gene assembly, see [29,30], similarly as in 
software engineering! 

Two different models have been proposed for gene assembly. The first one, 
proposed by Landweber and Kari, see [19,20], is intermolecular: the DNA ma- 
nipulations here may involve two molecules exchanging parts of their sequences 
through recombination. The other one, proposed by Ehrenfeucht, Prescott, and 
Rozenberg, see [11, 27], is intramolecular: here, all manipulations involve one sin- 
gle DNA molecule folding on itself and swapping parts of its sequence through 
recombination. In the intermolecular model one traditionally attempts to cap- 
ture both the process of identifying pointers and the process of using pointers 
by operations that accomplish gene assembly. In the intramolecular model one 
assumes that the pointer structure of a molecule is known, i.e., the pointers have 
been already identified. This implies some important differences between the 
models: e.g., the intramolecular representations of genes contain only pointers, 
with two occurrences for each pointer, and moreover, processing a pointer im- 
plies its removal from the processed string; these properties do not hold in the 
intermolecular model. Finally, the bulk of the work on the intermolecular model 
[1-3, 16-18] is concerned with the computational power of the operations in the 
sense of computability theory; e.g., it is proved in [18-20] that the model has the 
computational power of the Turing machine. On the other hand, research on the 
intramolecular model, see [4-6,8-10,12-14] and especially [7], deals with repre- 
sentations and properties of the gene assembly process (represented by various 
kinds of reduction systems). We believe that the two approaches together shed 
light on the computational nature of gene assembly in ciliates. 

In this paper, we take a novel approach to the intermolecular model, aiming
to compare the assembly strategies predicted by each model. Therefore, we
formalize both models in terms of MDS-IES descriptors and describe the gene 
assembly in terms of pointer reductions. We prove a universality result showing 
that the assembly power of the two models is the same: any gene that can be 
assembled in one model can also be assembled in the other. Nevertheless, the 
assembly strategies and the gene patterns throughout the process are completely 
different in the two models. Somewhat surprisingly, we show that the two models 
agree on the final results of the assembly process. 
2 The Structure of Micronuclear Genes

We shall now take a formal approach to gene assembly. The central role in this
process is played by pointers. These are short sequences at the ends of MDSs
(i.e., at the border of an MDS and an IES) - the pointer at the end of an MDS
M coincides as a nucleotide sequence with the pointer at the beginning of the
MDS following M in the macronuclear gene, see [22,25]. For the purpose of an
adequate formal representation, the first (last, resp.) MDS begins (ends, resp.)
with a specific marker b (e, resp.). It is enough for our purposes to describe any
MDS by the pair of pointers or markers flanking it at its ends. The gene will then
be described as a sequence of such pairs interspersed with strings describing the
sequence of IESs - we thus obtain MDS-IES descriptors, formally defined in the
following. For more details we refer to [7].

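For an alphabet $\Sigma$ and a string $u$ over $\Sigma$, we will denote by $[u]$ the circular
version of string $u$ - we refer to [7] for a formal definition. Let $\overline{\Sigma} = \{\overline{a} \mid a \in \Sigma\}$
and $u = a_1 a_2 \dots a_n$, $a_i \in \Sigma \cup \overline{\Sigma}$. The inverse of $u$ is the string $\overline{u} = \overline{a_n} \dots \overline{a_2}\, \overline{a_1}$,
where $\overline{\overline{a}} = a$ for all $a \in \Sigma$. The empty string will be denoted by $\Lambda$.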

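Let $M = \{b, e, \overline{b}, \overline{e}\}$ denote the set of the markers and their inverses. For
each index $\kappa \ge 2$, let

$\Delta_\kappa = \{2, 3, \dots, \kappa\}$ and $\Pi_\kappa = \Delta_\kappa \cup \overline{\Delta}_\kappa$.

An element $p \in \Pi_\kappa$ is called a pointer. Also let

$\Gamma_\kappa = \{(b,e)\} \cup \{(b,i), (i,e) \mid 2 \le i \le \kappa\} \cup \{(i,j) \mid 2 \le i < j \le \kappa\}$

and $\overline{\Gamma}_\kappa = \{(\overline{\beta}, \overline{\alpha}) \mid (\alpha, \beta) \in \Gamma_\kappa\}$. A string $\delta$ over $\Gamma_\kappa \cup \overline{\Gamma}_\kappa$ is called an MDS
descriptor if

(a) $\delta$ has exactly one occurrence from the set $\{b, \overline{b}\}$ and exactly one occurrence
from the set $\{e, \overline{e}\}$;

(b) $\delta$ has either zero, or two occurrences from $\{p, \overline{p}\}$, for any pointer $p \in \Pi_\kappa$.

Let $\mathcal{I}_\kappa = \{I_1, I_2, \dots, I_{\kappa-1}\}$ and $\overline{\mathcal{I}}_\kappa = \{\overline{I} \mid I \in \mathcal{I}_\kappa\}$. Any string $\iota$ over $\mathcal{I}_\kappa \cup \overline{\mathcal{I}}_\kappa$
is called an IES-descriptor if for any $I \in \mathcal{I}_\kappa$, $\iota$ contains at most one occurrence
from $\{I, \overline{I}\}$.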

A string $\gamma$ over $\Gamma_\kappa \cup \overline{\Gamma}_\kappa \cup \mathcal{I}_\kappa \cup \overline{\mathcal{I}}_\kappa$ is called an MDS-IES descriptor if

$\gamma = \iota_1 (p_1, q_1) \iota_2 (p_2, q_2) \dots \iota_n (p_n, q_n) \iota_{n+1}$,

where $\iota_1 \iota_2 \dots \iota_{n+1}$ is an IES-descriptor, and $(p_1, q_1) \dots (p_n, q_n)$ is an MDS-descriptor.
We say that $\gamma$ is assembled if $\gamma = \iota_1 (m, m') \iota_2$ for some IES-descriptors $\iota_1, \iota_2$
and $m, m' \in M$. If $(m, m') = (b, e)$, then we say that $\gamma$ is assembled in the
orthodox order and if $(m, m') = (\overline{e}, \overline{b})$, then we say that $\gamma$ is assembled in the
inverted order.

A circular string $[\gamma]$ is an (assembled) MDS-IES descriptor if $\gamma$ is so.

Example 1. The MDS-IES descriptor associated to the micronuclear actin I gene
in S. nova, shown in Fig. 1, is $M_3 I_1 M_4 I_2 M_6 I_3 M_5 I_4 M_7 I_5 M_9 I_6 \overline{M_2} I_7 M_1 I_8 M_8$. □
Fig. 1. Structure of the micronuclear gene encoding actin protein in the stichotrich
Sterkiella nova. The nine MDSs are in a scrambled disorder (3, 4, 6, 5, 7, 9, 2, 1, 8)



3 Two Models for Gene Assembly 

We briefly present in this section the intramolecular and the intermolecular mod- 
els for gene assembly in ciliates. We then formalize both models in terms of 
pointer reductions and MDS-IES descriptors. For more details we refer to [7, 11,
19, 20, 27].

3.1 The Intramolecular Model 

Three intramolecular operations were postulated in [11] and [27] for gene
assembly: ld, hi, and dlad. In each of these operations, a linear DNA molecule
containing a specific pattern is folded on itself in such a way that recombination
can take place. Operations hi and dlad yield as a result a linear DNA molecule,
while ld yields one linear and one circular DNA molecule, see Figs. 2-4. The
specific patterns required by each operation are described below:




Fig. 2. Illustration of the ld molecular operation

Fig. 3. Illustration of the hi molecular operation

Fig. 4. Illustration of the dlad molecular operation



(i) The ld operation is applicable to molecules in which two occurrences (on the
same strand) of the same pointer p flank one IES. The molecule is folded
so that the two occurrences of p are aligned to guide the recombination, see
Fig. 2. As a result, one circular molecule is excised.

(ii) The hi operation is applicable to molecules in which a pointer p has two
occurrences, of which exactly one is inverted. The folding is done as in Fig. 3
so that the two occurrences of p are aligned to guide the recombination.

(iii) The dlad operation is applicable to molecules in which two pointers p and
q have interspersed occurrences (on the same strand): p - q - p - q. The
folding is done as in Fig. 4 so that the two occurrences of p and q are aligned
to guide the double recombination.

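Operations ld, hi, and dlad can be formalized in terms of reduction rules $\mathsf{ld}$,
$\mathsf{hi}$, and $\mathsf{dlad}$ for MDS-IES descriptors as follows:

(1) For each $p \in \Pi_\kappa$, the ld-rule for $p$ is defined by:

$\mathsf{ld}_p(\delta_1 (q,p) \iota_1 (p,r) \delta_2) = \delta_1 (q,r) \delta_2 + [\iota_1]$,
$\mathsf{ld}_p(\iota_1 (p,m) \iota_2 (m',p) \iota_3) = \iota_1 \iota_3 + [(m',m) \iota_2]$,

where $q, r \in \Pi_\kappa \cup M$, $\delta_1, \delta_2$ are MDS-IES descriptors, $\iota_1, \iota_2, \iota_3$ are IES
descriptors, and $m, m' \in M$.

(2) For each $p \in \Pi_\kappa$, the hi-rule for $p$ is defined by:

$\mathsf{hi}_p(\delta_1 (p,q) \delta_2 (\overline{p}, \overline{r}) \delta_3) = \delta_1 \overline{\delta_2} (\overline{q}, \overline{r}) \delta_3$,
$\mathsf{hi}_p(\delta_1 (q,p) \delta_2 (\overline{r}, \overline{p}) \delta_3) = \delta_1 (q,r) \overline{\delta_2} \delta_3$,

where $q, r \in \Pi_\kappa \cup M$ and $\delta_1, \delta_2, \delta_3 \in (\Gamma_\kappa \cup \overline{\Gamma}_\kappa \cup \mathcal{I}_\kappa \cup \overline{\mathcal{I}}_\kappa)^*$.

(3) For each $p, q \in \Pi_\kappa$, $p \neq q$, the dlad-rule for $p$ and $q$ is defined by:

$\mathsf{dlad}_{p,q}(\delta_1 (p,r_1) \delta_2 (q,r_2) \delta_3 (r_3,p) \delta_4 (r_4,q) \delta_5) = \delta_1 \delta_4 (r_4,r_2) \delta_3 (r_3,r_1) \delta_2 \delta_5$,
$\mathsf{dlad}_{p,q}(\delta_1 (p,r_1) \delta_2 (r_2,q) \delta_3 (r_3,p) \delta_4 (q,r_4) \delta_5) = \delta_1 \delta_4 \delta_3 (r_3,r_1) \delta_2 (r_2,r_4) \delta_5$,
$\mathsf{dlad}_{p,q}(\delta_1 (r_1,p) \delta_2 (q,r_2) \delta_3 (p,r_3) \delta_4 (r_4,q) \delta_5) = \delta_1 (r_1,r_3) \delta_4 (r_4,r_2) \delta_3 \delta_2 \delta_5$,
$\mathsf{dlad}_{p,q}(\delta_1 (r_1,p) \delta_2 (r_2,q) \delta_3 (p,r_3) \delta_4 (q,r_4) \delta_5) = \delta_1 (r_1,r_3) \delta_4 \delta_3 \delta_2 (r_2,r_4) \delta_5$,
$\mathsf{dlad}_{p,q}(\delta_1 (p,r_1) \delta_2 (q,p) \delta_4 (r_4,q) \delta_5) = \delta_1 \delta_4 (r_4,r_1) \delta_2 \delta_5$,
$\mathsf{dlad}_{p,q}(\delta_1 (p,q) \delta_3 (r_3,p) \delta_4 (q,r_4) \delta_5) = \delta_1 \delta_4 \delta_3 (r_3,r_4) \delta_5$,
$\mathsf{dlad}_{p,q}(\delta_1 (r_1,p) \delta_2 (q,r_2) \delta_3 (p,q) \delta_5) = \delta_1 (r_1,r_2) \delta_3 \delta_2 \delta_5$,

where $r_1, r_2, r_3, r_4 \in \Pi_\kappa \cup M$, and $\delta_1, \delta_2, \delta_3, \delta_4, \delta_5 \in (\Gamma_\kappa \cup \overline{\Gamma}_\kappa \cup \mathcal{I}_\kappa \cup \overline{\mathcal{I}}_\kappa)^*$.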
Note that each operation removes one or two pointers from the MDS-IES
descriptor. When assembled (on a linear or on a circular string), the descriptor
has no pointers anymore. Thus, the whole process of gene assembly may be
viewed as a process of pointer removals.

If a composition $\varphi$ of ld, hi, and dlad operations is applicable to an MDS-IES
descriptor $\gamma$, then $\varphi(\gamma)$ is a set of linear and circular MDS-IES descriptors.
We say that $\varphi$ is a successful reduction for $\gamma$ if no pointers occur in any of the
descriptors in $\varphi(\gamma)$.

Example 2. Consider the MDS-IES descriptor $\delta = (b,2) I_1 (2,3) I_2 (4,e) I_3 (3,4)$.
An assembly strategy for this descriptor in the intramolecular model is the
following:

$\mathsf{dlad}_{3,4}(\delta) = (b,2) I_1 (2,e) I_3 I_2$,
$\mathsf{ld}_2(\mathsf{dlad}_{3,4}(\delta)) = (b,e) I_3 I_2 + [I_1]$.



□ 
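
The ld- and dlad-rules can be tried out on Example 2 with a small Python sketch. It relies on an encoding that is an assumption of this sketch and not part of the paper: a descriptor is flattened into a list of symbols, so that an MDS (p, q) becomes the two consecutive symbols p, q and an IES becomes a single symbol such as 'I1'. In this encoding ld removes the two occurrences of p and excises the segment between them, and dlad removes the four pointer occurrences and swaps the first and the third of the three segments they delimit.

```python
def is_pointer(symbol):
    """Pointers are written as (possibly '-'-prefixed) integers, e.g. '3' or '-3'."""
    return symbol.lstrip("-").isdigit()


def ld(seq, p):
    """ld-rule: A p X p B  ->  (A B, [X]); X must not contain pointers."""
    i = seq.index(p)
    j = seq.index(p, i + 1)
    excised = seq[i + 1:j]
    if any(is_pointer(s) for s in excised):
        raise ValueError("ld is not applicable: pointers between the occurrences")
    return seq[:i] + seq[j + 1:], excised


def dlad(seq, p, q):
    """dlad-rule: A p X q Y p Z q B  ->  A Z Y X B."""
    i1 = seq.index(p)
    j1 = seq.index(q, i1 + 1)
    i2 = seq.index(p, j1 + 1)
    j2 = seq.index(q, i2 + 1)
    x, y, z = seq[i1 + 1:j1], seq[j1 + 1:i2], seq[i2 + 1:j2]
    return seq[:i1] + z + y + x + seq[j2 + 1:]


# delta = (b,2) I1 (2,3) I2 (4,e) I3 (3,4) from Example 2, flattened.
delta = ["b", "2", "I1", "2", "3", "I2", "4", "e", "I3", "3", "4"]
after_dlad = dlad(delta, "3", "4")
linear, circular = ld(after_dlad, "2")
print(after_dlad)  # ['b', '2', 'I1', '2', 'e', 'I3', 'I2']
print(linear)      # ['b', 'e', 'I3', 'I2']
print(circular)    # ['I1']
```

Reading consecutive non-IES symbols as pairs, the printed lists correspond to (b,2) I1 (2,e) I3 I2 and to (b,e) I3 I2 + [I1], in agreement with the two lines of the example.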



3.2 The Intermolecular Model 

Three operations were postulated in [19] and [20] for gene assembly. One of these
operations is intramolecular: it is a sort of a generalized version of the ld
operation, while the other two are intermolecular: they involve recombination between
two different DNA molecules, linear or circular, see Figs. 5-6. We describe these
operations below in terms of pointers, similarly as for the intramolecular model.

(i) In the first operation a DNA molecule containing two occurrences of the same
pointer x (on the same strand) is folded so that they get aligned to guide
the recombination, see Fig. 5. Note that unlike in ld, the two occurrences of
x may have more than just one IES between them.

(ii) The second operation is the inverse of the first one: two DNA molecules, 
one linear and one circular, each containing one occurrence of a pointer x get 
aligned so that the two occurrences of x guide the recombination, yielding 
one linear molecule - see Fig. 5. 

(iii) The third operation is somewhat similar to the second one: two linear 
DNA molecules, each containing one occurrence of a pointer x get aligned so 
that the two occurrences of x guide the recombination, yielding two linear 
molecules, see Fig. 6. 

Note that the three molecular operations in this model are reversible, unlike 
the operations in the intramolecular model - this is one of the main differences 
between the two models. 

We formalize now this intermolecular model in terms of reduction rules for 
MDS-IES descriptors. The three operations defined above are modelled by the 
following reduction rules for MDS-IES descriptors: 
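Fig. 5. Illustration of the intramolecular operation of the Landweber-Kari model

Fig. 6. Illustration of the intermolecular operation of the Landweber-Kari model

$\delta_1 (q,p) \delta_2 (p,r) \delta_3 \to \delta_1 (q,r) \delta_3 + [\delta_2]$, (1)

$\delta_1 (p,q) \delta_2 (r,p) \delta_3 \to \delta_1 \delta_3 + [\delta_2 (r,q)]$, (2)

$\delta_1 (p,q) \delta_2 + [(r,p) \delta_3] \to \delta_1 \delta_3 (r,q) \delta_2$, (3)

$\delta_1 (q,p) \delta_2 + [(p,r) \delta_3] \to \delta_1 (q,r) \delta_3 \delta_2$, (4)

$\delta_1 (p,q) \delta_2 + \delta_3 (r,p) \delta_4 \to \delta_1 \delta_4 + \delta_3 (r,q) \delta_2$, (5)

where $\delta_1, \delta_2, \delta_3, \delta_4 \in (\Gamma_\kappa \cup \overline{\Gamma}_\kappa \cup \mathcal{I}_\kappa \cup \overline{\mathcal{I}}_\kappa)^*$.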

Note that each reduction rule above removes one pointer, thus making the 
whole process irreversible. Although the intermolecular model was specifically 
intended to be reversible, this restriction helps in unifying the notation for (and 
the reasoning about) the two models and it suffices for the results presented in 
this paper. 

If a composition $\varphi$ of the reduction rules (1)-(5) is applicable to an MDS-IES
descriptor $\gamma$, then $\varphi(\gamma)$ is a set of linear and circular MDS-IES descriptors.
We say that $\varphi$ is a successful reduction for $\gamma$ if no pointers occur in any of the
descriptors in $\varphi(\gamma)$.

Example 3. Consider the MDS-IES descriptor $\delta = (b,2) I_1 (2,3) I_2 (4,e) I_3 (3,4)$ of
Example 2. An assembly strategy for this descriptor in the intermolecular model
is the following:

$\delta \to (b,2) I_1 (2,4) + [I_2 (4,e) I_3] \to (b,2) I_1 (2,e) I_3 I_2 \to (b,e) I_3 I_2 + [I_1]$.

Note that although the assembly strategy is very different from the one in
Example 2, the final result of the assembly, $\{(b,e) I_3 I_2, [I_1]\}$, is the same in the two
models. □
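
The strategy of Example 3 can be replayed in a small Python sketch. As an assumption of this sketch only, descriptors are flattened into symbol lists (an MDS (p, q) becomes the consecutive symbols p, q). In this encoding rules (1) and (2) both read "A p X p B gives A B + [X]", and rules (3) and (4) both read "A p B + [p Y] gives A Y B"; the function names excise and insert are hypothetical.

```python
def excise(linear, p):
    """Rules (1) and (2): A p X p B  ->  (A B, [X])."""
    i = linear.index(p)
    j = linear.index(p, i + 1)
    return linear[:i] + linear[j + 1:], linear[i + 1:j]


def insert(linear, circular, p):
    """Rules (3) and (4): A p B + [p Y]  ->  A Y B.

    The circular string is first rotated so that it starts right after
    its occurrence of p; both occurrences of p disappear.
    """
    i = linear.index(p)
    k = circular.index(p)
    rotated = circular[k + 1:] + circular[:k]
    return linear[:i] + rotated + linear[i + 1:]


# delta = (b,2) I1 (2,3) I2 (4,e) I3 (3,4), flattened.
delta = ["b", "2", "I1", "2", "3", "I2", "4", "e", "I3", "3", "4"]

step1, circle = excise(delta, "3")       # (b,2) I1 (2,4) + [I2 (4,e) I3]
step2 = insert(step1, circle, "4")       # (b,2) I1 (2,e) I3 I2
final, residual = excise(step2, "2")     # (b,e) I3 I2 + [I1]

print(step1, circle)
print(step2)
print(final, residual)
```

The final pair, ['b', 'e', 'I3', 'I2'] and ['I1'], coincides with the result of the intramolecular strategy of Example 2.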
4 Reduction Strategies in the Two Models

The obvious difficulty with the intermolecular model is that it cannot deal with 
DNA molecules in which a pointer is inverted - this is the case, e.g., for the 
actin I gene in S. nova. Nevertheless, we can show that inverted pointers can be
handled in this model, provided the input molecule (or its MDS-IES descriptor) 
is available in two copies. Moreover, we consider all linear descriptors modulo 
inversion. The first assumption is essentially used in research on the intermolec- 
ular model, see [16-18,20]. The second assumption is quite natural whenever 
we model double-stranded DNA molecules. As a matter of fact, we use the two 
assumptions to conclude that for each input descriptor, both the descriptor and 
its inversion are available. Then the hi-rule can be simulated using the
intermolecular rules as follows.

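Let $\delta = \delta_1 (p,q) \delta_2 (\overline{p}, \overline{r}) \delta_3$ (the other case is treated similarly) be an MDS-IES
descriptor to which $\mathsf{hi}_p$ is applicable. Therefore, we assume that also $\overline{\delta} =
\overline{\delta_3} (r,p) \overline{\delta_2} (\overline{q}, \overline{p}) \overline{\delta_1}$ is available. Then we obtain

$\delta + \overline{\delta} \to \delta_1 \overline{\delta_2} (\overline{q}, \overline{p}) \overline{\delta_1} + \overline{\delta_3} (r,q) \delta_2 (\overline{p}, \overline{r}) \delta_3$

$\to \delta_1 \overline{\delta_2} (\overline{q}, \overline{r}) \delta_3 + \overline{\delta_3} (r,q) \delta_2 \overline{\delta_1} = \mathsf{hi}_p(\delta) + \overline{\mathsf{hi}_p(\delta)}$.

Note that, having two copies of the initial string available, this rule yields two
copies of $\mathsf{hi}_p(\delta)$.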
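
The two-step simulation above can be checked on a concrete descriptor with the following Python sketch. The flattened list encoding, the '-' prefix for inverted symbols, the function names and the small descriptor (2,e) I1 (-2,-b) I2 are all assumptions of the sketch; in this encoding rule (5) reads "A p B + C p D gives A D + C B".

```python
def inv(symbol):
    """Invert a single symbol: '2' <-> '-2', 'I1' <-> '-I1'."""
    return symbol[1:] if symbol.startswith("-") else "-" + symbol


def inverse(seq):
    """Inverse of a string: reverse it and invert every symbol."""
    return [inv(s) for s in reversed(seq)]


def hi(seq, p):
    """hi-rule: A p X -p B -> A inverse(X) B (either occurrence may come first)."""
    i, j = sorted((seq.index(p), seq.index(inv(p))))
    return seq[:i] + inverse(seq[i + 1:j]) + seq[j + 1:]


def exchange(m1, m2, p):
    """Rule (5): A p B + C p D  ->  (A D, C B)."""
    i, k = m1.index(p), m2.index(p)
    return m1[:i] + m2[k + 1:], m2[:k] + m1[i + 1:]


# Hypothetical descriptor (2,e) I1 (-2,-b) I2 together with its inversion.
delta = ["2", "e", "I1", "-2", "-b", "I2"]
first_a, first_b = exchange(delta, inverse(delta), "2")
second_a, second_b = exchange(first_a, first_b, "-2")

print(second_a == hi(delta, "2"))           # True
print(second_b == inverse(hi(delta, "2")))  # True
```

Two applications of rule (5), first on the pointer and then on its inversion, thus produce the descriptor obtained by the hi-rule together with its inversion.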

We also observe that the ld-rule is a particular case of intermolecular rules (1)
and (2), obtained by setting $\delta_2 = \Lambda$. Moreover, the dlad-rule can be simulated
using intermolecular rules as follows.

Let $\delta = \delta_1 (p,r_1) \delta_2 (q,r_2) \delta_3 (r_3,p) \delta_4 (r_4,q) \delta_5$ be an MDS-IES descriptor to
which $\mathsf{dlad}_{p,q}$ is applicable - all other cases can be treated similarly. Then

$\delta \to \delta_1 \delta_4 (r_4,q) \delta_5 + [\delta_2 (q,r_2) \delta_3 (r_3,r_1)] = \delta_1 \delta_4 (r_4,q) \delta_5 + [\delta_3 (r_3,r_1) \delta_2 (q,r_2)]$

$\to \delta_1 \delta_4 (r_4,r_2) \delta_3 (r_3,r_1) \delta_2 \delta_5 = \mathsf{dlad}_{p,q}(\delta)$.

The following result is thus proved.

Theorem 1. Let $\delta$ be an MDS-IES descriptor having a successful reduction in
the intramolecular model. If two copies of $\delta$ are available, then $\delta$ has a successful
reduction in the intermolecular model.

The following universality result has been proved in [8], see [7] for more 
details. 

Theorem 2. Any MDS-IES descriptor has a successful reduction in the in- 
tramolecular model. 

Theorems 1 and 2 give the following universality result for the intermolecular 
model. 
Corollary 1. Any MDS-IES descriptor available in two copies has a successful
reduction in the intermolecular model. 

5 Invariants 

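In the following two examples we consider the actin I gene in S. nova and
investigate assembly strategies for this gene in the intra- and inter-molecular models.

Example 4. Consider the actin gene in Sterkiella nova, see Fig. 1, having the
MDS-IES descriptor

$\delta = (3,4) I_1 (4,5) I_2 (6,7) I_3 (5,6) I_4 (7,8) I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 (8,9)$.

Consider then an assembly strategy for $\delta$, e.g., $\mathsf{ld}_4$, $\mathsf{dlad}_{5,6}$, $\mathsf{ld}_7$, $\mathsf{dlad}_{8,9}$, $\mathsf{hi}_{\overline{2}}$, $\mathsf{hi}_3$ (applied in this order):

$\mathsf{ld}_4(\delta) = (3,5) I_2 (6,7) I_3 (5,6) I_4 (7,8) I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 (8,9) + [I_1]$,
$\mathsf{dlad}_{5,6}(\mathsf{ld}_4(\delta)) = (3,7) I_3 I_2 I_4 (7,8) I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 (8,9) + [I_1]$,
$\mathsf{ld}_7(\mathsf{dlad}_{5,6}(\mathsf{ld}_4(\delta))) = (3,8) I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 (8,9) + [I_1] + [I_3 I_2 I_4]$,
$\mathsf{dlad}_{8,9}(\mathsf{ld}_7(\mathsf{dlad}_{5,6}(\mathsf{ld}_4(\delta)))) = (3,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 I_5 + [I_1] + [I_3 I_2 I_4]$,
$\mathsf{hi}_{\overline{2}}(\mathsf{dlad}_{8,9}(\mathsf{ld}_7(\mathsf{dlad}_{5,6}(\mathsf{ld}_4(\delta))))) = (3,e) I_6 (\overline{3}, \overline{b}) \overline{I_7} I_8 I_5 + [I_1] + [I_3 I_2 I_4]$,
$\mathsf{hi}_3(\mathsf{hi}_{\overline{2}}(\mathsf{dlad}_{8,9}(\mathsf{ld}_7(\mathsf{dlad}_{5,6}(\mathsf{ld}_4(\delta)))))) = \overline{I_6} (\overline{e}, \overline{b}) \overline{I_7} I_8 I_5 + [I_1] + [I_3 I_2 I_4]$.

Thus, the gene is assembled in the inverted order, placed in a linear DNA
molecule, with the IES $\overline{I_6}$ preceding it and the sequence of IESs $\overline{I_7} I_8 I_5$
succeeding it. Two circular molecules are also produced: $[I_1]$ and $[I_3 I_2 I_4]$. □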

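Example 5. Consider the same actin gene in Sterkiella nova with the MDS-IES
descriptor

$\delta = (3,4) I_1 (4,5) I_2 (6,7) I_3 (5,6) I_4 (7,8) I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 (8,9)$.

Then $\delta$ can be assembled in the intermolecular model as follows:

$\delta \to (3,4) I_1 (4,6) I_4 (7,8) I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 (8,9) + [I_2 (6,7) I_3]$

$\to (3,4) I_1 (4,6) I_4 (7,9) + [I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8] + [I_2 (6,7) I_3]$

$\to (3,6) I_4 (7,9) + [I_1] + [I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8] + [I_2 (6,7) I_3]$

$\to (3,6) I_4 I_3 I_2 (6,9) + [I_1] + [I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8]$

$\to (3,9) + [I_4 I_3 I_2] + [I_1] + [I_5 (9,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8]$

$\to (3,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 I_5 + [I_4 I_3 I_2] + [I_1]$.

Since $\overline{\delta}$ is also available, the assembly continues as follows. Here, for a (circular)
string $\tau$, we use $2 \cdot \tau$ to denote $\tau + \tau$:

$\delta + \overline{\delta} \to (3,e) I_6 (\overline{3}, \overline{2}) I_7 (b,2) I_8 I_5 + \overline{I_5}\, \overline{I_8} (\overline{2}, \overline{b}) \overline{I_7} (2,3) \overline{I_6} (\overline{e}, \overline{3}) + 2 \cdot [I_4 I_3 I_2] + 2 \cdot [I_1]$

$\to (3,e) I_6 (\overline{3}, \overline{b}) \overline{I_7} (2,3) \overline{I_6} (\overline{e}, \overline{3}) + \overline{I_5}\, \overline{I_8} I_7 (b,2) I_8 I_5 + 2 \cdot [I_4 I_3 I_2] + 2 \cdot [I_1]$

$\to (3,e) I_6 (\overline{3}, \overline{b}) \overline{I_7} I_8 I_5 + \overline{I_5}\, \overline{I_8} I_7 (b,3) \overline{I_6} (\overline{e}, \overline{3}) + 2 \cdot [I_4 I_3 I_2] + 2 \cdot [I_1]$

$\to (3,e) I_6 + \overline{I_5}\, \overline{I_8} I_7 (b,3) \overline{I_6} (\overline{e}, \overline{b}) \overline{I_7} I_8 I_5 + 2 \cdot [I_4 I_3 I_2] + 2 \cdot [I_1]$

$\to \overline{I_6} (\overline{e}, \overline{b}) \overline{I_7} I_8 I_5 + \overline{I_5}\, \overline{I_8} I_7 (b,e) I_6 + 2 \cdot [I_4 I_3 I_2] + 2 \cdot [I_1]$

$= 2 \cdot (\overline{I_6} (\overline{e}, \overline{b}) \overline{I_7} I_8 I_5 + [I_1] + [I_3 I_2 I_4])$.

Note that this intermolecular assembly predicts the same context for the
assembled string, the same set of residual strings, and the same linearity of the
assembled string as the intramolecular assembly considered in Example 4. □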

It is clear from the above two examples, see also Examples 2 and 3, that the 
two models for gene assembly predict completely different assembly strategies for 
the same micronuclear gene. As it turns out however, the predicted final result 
of the assembly, i.e., the linearity of the assembled gene and the exact nucleotide 
sequences of all excised molecules, is the same in the two models, see [7] for 
details. The following is a result from [7], see also [10]. 

Theorem 3. Let $\delta$ be an MDS-IES descriptor. If $\varphi_1$ and $\varphi_2$ are any two
successful assembly strategies for $\delta$, intra- or inter-molecular, then

(1) if $\varphi_1(\delta)$ is assembled in a linear descriptor, then so is $\varphi_2(\delta)$;

(2) if $\varphi_1(\delta)$ is assembled in a linear descriptor in orthodox order, then so is
$\varphi_2(\delta)$;

(3) the sequence of IESs flanking the assembled gene is the same in $\varphi_1(\delta)$ and
$\varphi_2(\delta)$;

(4) the sequence of IESs in all excised descriptors is the same in $\varphi_1(\delta)$ and
$\varphi_2(\delta)$;

(5) there is an equal number of circular descriptors in $\varphi_1(\delta)$ and $\varphi_2(\delta)$.
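
Example 6. Consider the MDS-IES descriptor

$\delta = (\overline{10}, \overline{8}) I_1 (\overline{3}, \overline{b}) I_2 (\overline{5}, \overline{3}) I_3 (10,11) I_4 (5,8) I_5 (11,e)$.

A successful assembly strategy for $\delta$ in the intramolecular model is the following:

$\mathsf{hi}_{\overline{10}}(\delta) = \overline{I_3} (3,5) \overline{I_2} (b,3) \overline{I_1} (8,11) I_4 (5,8) I_5 (11,e)$,
$\mathsf{dlad}_{8,11}(\mathsf{hi}_{\overline{10}}(\delta)) = \overline{I_3} (3,5) \overline{I_2} (b,3) \overline{I_1} I_5 I_4 (5,e)$,
$\mathsf{dlad}_{3,5}(\mathsf{dlad}_{8,11}(\mathsf{hi}_{\overline{10}}(\delta))) = \overline{I_3}\, \overline{I_1} I_5 I_4 \overline{I_2} (b,e)$.

Thus, $\delta$ is always assembled in a linear molecule, and no IES is excised during
the assembly process, i.e., no ld is ever applied in a process of assembling $\delta$.
Moreover, the assembled descriptor will always be preceded by the IES sequence
$\overline{I_3}\, \overline{I_1} I_5 I_4 \overline{I_2}$ and followed by the empty IES sequence. □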

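Example 7. Consider the MDS-IES descriptor

$\delta = (\overline{10}, \overline{8}) I_1 (\overline{3}, \overline{b}) I_2 (\overline{5}, \overline{3}) I_3 (10,11) I_4 (5,8) I_5 (11,e)$

from Example 6. Then $\delta$ can be assembled in the intermolecular model as follows:

$\delta \to (\overline{10}, \overline{8}) I_1 I_3 (10,11) I_4 (5,8) I_5 (11,e) + [I_2 (\overline{5}, \overline{b})]$

$\to (\overline{10}, \overline{8}) I_1 I_3 (10,e) + [I_4 (5,8) I_5] + [I_2 (\overline{5}, \overline{b})]$.

Since also $\overline{\delta}$ is available, the assembly continues as follows:

$\delta + \overline{\delta} \to (\overline{10}, \overline{8}) I_1 I_3 (10,e) + (\overline{e}, \overline{10}) \overline{I_3}\, \overline{I_1} (8,10) + 2 \cdot [I_4 (5,8) I_5] + 2 \cdot [I_2 (\overline{5}, \overline{b})]$

$\to \overline{I_3}\, \overline{I_1} (8,10) + (\overline{e}, \overline{8}) I_1 I_3 (10,e) + 2 \cdot [I_4 (5,8) I_5] + 2 \cdot [I_2 (\overline{5}, \overline{b})]$

$\to \overline{I_3}\, \overline{I_1} (8,e) + (\overline{e}, \overline{8}) I_1 I_3 + [I_4 (5,8) I_5] + [\overline{I_5} (\overline{8}, \overline{5}) \overline{I_4}] + 2 \cdot [I_2 (\overline{5}, \overline{b})]$

$\to \overline{I_3}\, \overline{I_1} I_5 I_4 (5,e) + (\overline{e}, \overline{5}) \overline{I_4}\, \overline{I_5} I_1 I_3 + [(\overline{5}, \overline{b}) I_2] + [\overline{I_2} (b,5)]$

$\to \overline{I_3}\, \overline{I_1} I_5 I_4 \overline{I_2} (b,e) + (\overline{e}, \overline{b}) I_2 \overline{I_4}\, \overline{I_5} I_1 I_3$

$= 2 \cdot \overline{I_3}\, \overline{I_1} I_5 I_4 \overline{I_2} (b,e)$.

Note that, again, we obtain the same context for the assembled string, the same
set of residual strings, and the same linearity of the assembled string as the
intramolecular assembly considered in Example 6. □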
Abstract. There are at least two points of view when representing elements
of $\mathbb{F}_{2^n}$, the field of $2^n$ elements. We could represent the (nonzero)
elements as powers of a generating element, the exponent ranging from 0
to $2^n - 2$. On the other hand, we could represent the elements as strings
of n bits. In the former representation, multiplication becomes a very
easy task, whereas in the latter one, addition is obvious. In this note, we
focus on representing $\mathbb{F}_{2^n}$ as strings of n bits in such a way that the natural
basis (1, 0, . . . , 0), (0, 1, . . . , 0), . . ., (0, 0, . . . , 1) becomes self-dual. We
also outline an idea which leads to a very simple algorithm for finding
a self-dual basis. Finally we study multiplication tables for the natural
basis and present necessary and sufficient conditions for a multiplication
table to give $\mathbb{F}_2^n$ a field structure in such a way that the natural basis is
self-dual.



1 Introduction and Preliminaries 

Bit strings and operations on them have an essential role in theoretical
computer science. The reason for this is evident: to compute is to operate
on finite sets, and all finite sets can be encoded into binary strings. Therefore, all
properties (subsets) of finite sets can be represented by Boolean functions, that
is, by functions from bit strings to bits.

The most difficult problems in theoretical computer science can be
represented in terms of Boolean functions. For instance, it is known that all Boolean
functions can be represented by Boolean circuits [7], but far too little is known
about the number of gates needed to implement those functions. Quite a simple counting
argument shows that on n variables, most Boolean functions need on the order of
$2^n/n$ gates to be implemented [7], and yet the best known lower bound is only linear in n.

If the scalar multiplication and addition are defined bitwise, the bit strings 
of length n form a vector space of dimension n over the binary field. However, 
for many applications it would be helpful if the bit strings could have even a 
stronger algebraic structure. Numerous very good examples of such applications 
can be found in the theory of error-correcting codes [5] . 
We will use the notation $\mathbb{F}_q$ for the finite field of q elements, but we are going to
concentrate only on the case $q = 2^n$. By the binary field we mean the
two-element field $\mathbb{F}_2$. An n-dimensional vector space over $\mathbb{F}_2$ is denoted, as usual,
by $\mathbb{F}_2^n$. The characters of the additive group of $\mathbb{F}_2^n$ are well known and easy to
define (see [2], for example): for each element $y = (y_1, \ldots, y_n) \in \mathbb{F}_2^n$, there exists
a character $\chi_y$ defined as

$$\chi_y(x) = (-1)^{x \cdot y},$$

where $x \cdot y = x_1 y_1 + \ldots + x_n y_n$. Notice that $x \cdot y$ belongs to $\mathbb{F}_2$, but $(-1)^{x \cdot y}$ is
interpreted in the most obvious way.

It turns out that all functions $\mathbb{F}_2^n \to \mathbb{C}$ can be represented as linear
combinations of the characters [2]. The characters, in fact, have an even more important
role: there is an obvious way to introduce a Euclidean vector space structure
for the set of functions $\mathbb{F}_2^n \to \mathbb{C}$, and the characters form an orthonormal basis
of that Euclidean space. This role of the characters allows us to use discrete
analogues of Fourier analysis, see [1] and [3] for instance.
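
The orthonormality claim can be checked by brute force for small n. The following is a minimal sketch, assuming the normalised inner product $\langle f, g \rangle = 2^{-n} \sum_x f(x) g(x)$ (the characters are real-valued, so no conjugation is needed); the function names are illustrative and not taken from the text.

```python
from itertools import product


def chi(y, x):
    """Character chi_y(x) = (-1)^(x . y), with x . y computed in F_2."""
    dot = sum(xi * yi for xi, yi in zip(x, y)) % 2
    return -1 if dot else 1


def inner(f, g, n):
    """Normalised inner product of two real-valued functions on F_2^n."""
    points = list(product((0, 1), repeat=n))
    return sum(f(x) * g(x) for x in points) / len(points)


def characters_are_orthonormal(n):
    """Check <chi_y, chi_z> = 1 if y == z and 0 otherwise, for all y, z."""
    vectors = list(product((0, 1), repeat=n))
    for y in vectors:
        for z in vectors:
            value = inner(lambda x: chi(y, x), lambda x: chi(z, x), n)
            if value != (1 if y == z else 0):
                return False
    return True
```

The check enumerates all $2^n \cdot 2^n$ pairs of characters, so it is only meant for very small n.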

To introduce more algebraic structure in $\mathbb{F}_2^n$, it is always possible to define
(usually in many ways) the multiplication in $\mathbb{F}_2^n$ in such a way that $\mathbb{F}_2^n$ becomes
the field $\mathbb{F}_{2^n}$, the extension of $\mathbb{F}_2$ of degree n [5].

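The trace of an element $\alpha \in \mathbb{F}_{2^n}$ over the prime field $\mathbb{F}_2$ is defined as

$$\mathrm{Tr}(\alpha) = \alpha + \alpha^{2} + \alpha^{2^2} + \ldots + \alpha^{2^{n-1}},$$

and it is a well-known fact that always $\mathrm{Tr}(\alpha) \in \mathbb{F}_2$, and that $\mathrm{Tr} : \mathbb{F}_{2^n} \to \mathbb{F}_2$
is a linear mapping satisfying $|\mathrm{Ker}(\mathrm{Tr})| = 2^{n-1}$. As an easy consequence of the
definition, we have also that $\mathrm{Tr}(\alpha^2) = \mathrm{Tr}(\alpha)$ for each $\alpha \in \mathbb{F}_{2^n}$.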
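
The three properties of the trace can be observed numerically. The sketch below is only an illustration under stated assumptions: field elements are encoded as n-bit integers (polynomials over $\mathbb{F}_2$), and the field is built as $\mathbb{F}_2[x]/(x^3 + x + 1)$; this modulus and the helper names `gf_mul` and `trace` are choices made here, not taken from the text.

```python
def gf_mul(a, b, modulus, n):
    """Multiply a and b in F_2[x]/(modulus); elements are n-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:          # degree reached n, so reduce by the modulus
            a ^= modulus
    return result


def trace(a, modulus, n):
    """Tr(a) = a + a^2 + a^4 + ... + a^(2^(n-1)), addition being XOR."""
    total, power = 0, a
    for _ in range(n):
        total ^= power
        power = gf_mul(power, power, modulus, n)
    return total


N_BITS = 3
MODULUS = 0b1011            # x^3 + x + 1, irreducible over F_2 (assumed choice)
TRACES = [trace(a, MODULUS, N_BITS) for a in range(2 ** N_BITS)]
```

With this encoding every entry of `TRACES` is 0 or 1, exactly half of them are 0, and squaring an element does not change its trace.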

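As a vector space over $\mathbb{F}_2$, field $\mathbb{F}_{2^n}$ has an n-element basis $B = \{\alpha_1, \ldots, \alpha_n\}$
over $\mathbb{F}_2$. We say that the basis B is self-dual, if

$$\mathrm{Tr}(\alpha_i \alpha_j) = \begin{cases} 1, & \text{if } i = j, \text{ and} \\ 0, & \text{if } i \neq j. \end{cases}$$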



For some practical applications, bases of special types are valuable. For instance,
multiplication in a bit-string representation is in some sense "computationally
cheap" if the chosen basis is a so-called normal basis [6]. In some other situations,
a self-dual basis is extremely welcome. Consider, for example, the characters of
the additive group of $\mathbb{F}_{2^n}$. As is well known, they are all of the form

$$\chi_y(x) = (-1)^{\mathrm{Tr}(xy)},$$

where $y \in \mathbb{F}_{2^n}$ [5]. Assuming that the basis $\{\alpha_1, \ldots, \alpha_n\}$ is self-dual, we can
represent

$$x = x_1 \alpha_1 + \ldots + x_n \alpha_n$$

and

$$y = y_1 \alpha_1 + \ldots + y_n \alpha_n,$$

where $x_i, y_i \in \mathbb{F}_2$. Using this representation we can find out that

$$\mathrm{Tr}(xy) = \mathrm{Tr}\Big(\sum_{i=1}^{n} x_i \alpha_i \sum_{j=1}^{n} y_j \alpha_j\Big) = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i y_j \mathrm{Tr}(\alpha_i \alpha_j) = \sum_{i=1}^{n} x_i y_i.$$

But then

$$\chi_y(x) = (-1)^{x_1 y_1 + \ldots + x_n y_n},$$

which is to say that the character value (character determined by the element
$y = y_1 \alpha_1 + \ldots + y_n \alpha_n$) on element $x = x_1 \alpha_1 + \ldots + x_n \alpha_n$ is exactly the same as
the corresponding character value in the additive group $\mathbb{F}_2^n$.



2 Finding a Self-Dual Basis 

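In this section, we outline the idea of a very simple algorithm to find a self-dual
basis of $\mathbb{F}_{2^n}$. It seems that similar ideas were already present in [4].

The most crucial observation for the algorithm is the following.

Lemma 1. Matrix

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \in \mathbb{F}_2^{3 \times 3}$$

can be diagonalized.

Proof. A straightforward calculation shows that

$$\begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and that the diagonalizing matrix is indeed invertible. □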
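
The calculation in the proof is easy to replay mechanically. The sketch below multiplies the three matrices modulo 2; `P_INV` is an inverse of the diagonalizing matrix worked out by hand for this illustration (it is not given in the text), so that invertibility can be checked as well.

```python
def mat_mul(a, b):
    """Matrix product over F_2 for matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) % 2
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(column) for column in zip(*a)]


A = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]        # matrix of Lemma 1
P = [[1, 0, 1], [1, 1, 0], [1, 1, 1]]        # diagonalizing matrix from the proof
P_INV = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]    # hand-computed inverse (assumption)
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

CONGRUENT = mat_mul(mat_mul(P, A), transpose(P))   # P A P^T over F_2
```

Note that "diagonalized" is meant here in the sense of congruence, $P A P^T$, which is the operation needed for changing the basis of a bilinear form.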

Let now $B = \{\alpha_1, \ldots, \alpha_n\}$ be any basis of $\mathbb{F}_{2^n}/\mathbb{F}_2$, and define a matrix
$M \in \mathbb{F}_2^{n \times n}$ as

$$M_{ij} = \mathrm{Tr}(\alpha_i \alpha_j).$$

Clearly M is symmetric. Matrix M is also invertible, for otherwise its rows would
be linearly dependent, and therefore we could find elements $c_1, \ldots, c_n \in \mathbb{F}_2$ not
all zero such that

$$\sum_{i=1}^{n} c_i \mathrm{Tr}(\alpha_i \alpha_j) = 0$$

for each j. But this would mean that there would be an element $\gamma = c_1 \alpha_1 +
\ldots + c_n \alpha_n \in \mathbb{F}_{2^n}$ such that $\mathrm{Tr}(\gamma \alpha_j) = 0$ for each basis element $\alpha_j$. It would follow
that the mapping $\alpha \mapsto \mathrm{Tr}(\gamma \alpha)$ is identically zero, which implies that $\gamma = 0$. But this
would contradict the fact that there is a nonzero element $c_i$.

If M is a diagonal matrix, then B is already a self-dual basis (since M is
invertible, there cannot be zeros in the diagonal). Assume then that M is not
diagonal. Since B is a basis, not all the diagonal elements are 0, for otherwise
$\mathrm{Tr}(\alpha_i^2) = \mathrm{Tr}(\alpha_i) = 0$ for each basis element $\alpha_i$, which would contradict the
property $|\mathrm{Ker}(\mathrm{Tr})| = 2^{n-1}$.

Because there is at least one nonzero element in the diagonal, there exists an
invertible matrix $E_1 \in \mathbb{F}_2^{n \times n}$ such that $E_1 M E_1^T$ can be written as

$$E_1 M E_1^T = \begin{pmatrix} 1 & 0 \\ 0 & M_1 \end{pmatrix},$$

where $M_1 \in \mathbb{F}_2^{(n-1) \times (n-1)}$ is symmetric.

Now matrix $M_1$ must have nonzero elements, for otherwise M would not be
invertible. If $M_1$ has a nonzero diagonal element, we can again find a matrix
$E_2 \in \mathbb{F}_2^{n \times n}$ such that

$$E_2 E_1 M E_1^T E_2^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & M_2 \end{pmatrix},$$

where $M_2 \in \mathbb{F}_2^{(n-2) \times (n-2)}$ is a symmetric matrix.

If $M_1$ does not contain any nonzero element in its diagonal, we can find,
because $M_1$ is symmetric, an invertible matrix $E_2' \in \mathbb{F}_2^{(n-1) \times (n-1)}$ such that

$$E_2' M_1 E_2'^T = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & M_2 \end{pmatrix},$$

where $M_2$ is an $(n-3) \times (n-3)$-matrix. Matrix $E_2'$ can be straightforwardly
extended to an invertible $n \times n$-matrix $E_2$ which satisfies

$$E_2 E_1 M E_1^T E_2^T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & M_2 \end{pmatrix}.$$

On the other hand, by Lemma 1, the $3 \times 3$-matrix in the left upper corner is
diagonalizable, which implies that there exists an invertible matrix $E_3 \in \mathbb{F}_2^{n \times n}$
such that

$$E_3 M E_3^T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & M_2 \end{pmatrix}.$$

Continuing the same reasoning, we can eventually find an invertible matrix
$E \in \mathbb{F}_2^{n \times n}$ such that $E M E^T$ is diagonal. Since $E M E^T$ is invertible, necessarily
$E M E^T = I$ is the identity matrix.

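Since E is invertible, the set $\{\beta_1, \ldots, \beta_n\}$ defined by

$$\beta_i = \sum_{k=1}^{n} E_{ik} \alpha_k$$

is also a basis of $\mathbb{F}_{2^n}/\mathbb{F}_2$, and

$$\beta_i \beta_j = \sum_{k=1}^{n} E_{ik} \alpha_k \sum_{l=1}^{n} E_{jl} \alpha_l = \sum_{k=1}^{n} \sum_{l=1}^{n} E_{ik} E_{jl} \alpha_k \alpha_l.$$

It follows that

$$\mathrm{Tr}(\beta_i \beta_j) = \sum_{k=1}^{n} \sum_{l=1}^{n} E_{ik} E_{jl} \mathrm{Tr}(\alpha_k \alpha_l)
= \sum_{k=1}^{n} \sum_{l=1}^{n} E_{ik} M_{kl} E_{jl}
= (E M E^T)_{ij} = I_{ij},$$

which proves that the basis $\{\beta_1, \ldots, \beta_n\}$ is self-dual.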
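
The reduction described in this section can be sketched in code. The following is one possible reading of the procedure, not the authors' implementation: it takes the trace matrix M (symmetric, invertible, with a nonzero diagonal entry) and returns a matrix E with $E M E^T = I$ over $\mathbb{F}_2$. The function names and the particular sequence of elementary row and column operations are assumptions of this sketch; the four `add` calls at the end of the zero-diagonal branch apply the diagonalizing matrix of Lemma 1 to the rows k-1, k, k+1.

```python
def congruence_to_identity(matrix):
    """Return E over F_2 with E M E^T = I, for a symmetric invertible M
    over F_2 that has at least one nonzero diagonal entry."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    e = [[int(i == j) for j in range(n)] for i in range(n)]

    def add(i, j):
        # Row i += row j and column i += column j in m; row i += row j in e.
        for c in range(n):
            m[i][c] ^= m[j][c]
        for r in range(n):
            m[r][i] ^= m[r][j]
        for c in range(n):
            e[i][c] ^= e[j][c]

    def swap(i, j):
        m[i], m[j] = m[j], m[i]
        for row in m:
            row[i], row[j] = row[j], row[i]
        e[i], e[j] = e[j], e[i]

    k = 0
    while k < n:
        pivot = next((i for i in range(k, n) if m[i][i]), None)
        if pivot is not None:
            # A diagonal 1 exists: move it to position k and clear row/column k.
            swap(k, pivot)
            for j in range(k + 1, n):
                if m[j][k]:
                    add(j, k)
            k += 1
            continue
        # Remaining block has zero diagonal: isolate a block [[0, 1], [1, 0]].
        pair = next(((i, j) for i in range(k, n)
                     for j in range(i + 1, n) if m[i][j]), None)
        if k == 0 or pair is None:
            raise ValueError("M must be symmetric, invertible, "
                             "with a nonzero diagonal entry")
        swap(k, pair[0])
        swap(k + 1, pair[1])
        for l in range(k + 2, n):
            if m[l][k]:
                add(l, k + 1)
            if m[l][k + 1]:
                add(l, k)
        # Rows k-1, k, k+1 now carry the matrix of Lemma 1; diagonalize it.
        for i, j in ((k, k - 1), (k + 1, k), (k - 1, k), (k - 1, k + 1)):
            add(i, j)
        k += 2
    return e


def gram(e, m):
    """Return E M E^T over F_2."""
    n = len(m)
    em = [[sum(e[i][k] * m[k][j] for k in range(n)) % 2 for j in range(n)]
          for i in range(n)]
    return [[sum(em[i][k] * e[j][k] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]
```

Given the returned E, the rows of E are the coordinates of the new basis elements $\beta_i$ with respect to the original basis. For the matrix of Lemma 1 the sketch returns exactly the diagonalizing matrix used in the proof of the lemma.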

3 Multiplication Tables 

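Assume now that $\{\alpha_1, \ldots, \alpha_n\}$ is a self-dual basis of $\mathbb{F}_{2^n}/\mathbb{F}_2$, and choose a
coordinate representation $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, ..., $e_n = (0, 0, \ldots, 1)$
for the basis elements $\alpha_1, \alpha_2, \ldots, \alpha_n$. Moreover, define matrices $C^{(k)} \in
\mathbb{F}_2^{n \times n}$ by the condition

$$\alpha_i \cdot \alpha_j = \sum_{k=1}^{n} C^{(k)}_{ij} \alpha_k. \qquad (1)$$

Since $e_i$ is merely a renaming of $\alpha_i$, we can build up a multiplication
operation in $\mathbb{F}_2^n$ by first defining

$$e_i \cdot e_j = \sum_{k=1}^{n} C^{(k)}_{ij} e_k$$

and then extending this to be a bilinear mapping $\mathbb{F}_2^n \times \mathbb{F}_2^n \to \mathbb{F}_2^n$. This clearly
defines a field structure for $\mathbb{F}_2^n$ in such a way that the vectors $e_1, \ldots, e_n$ of the
natural basis form a self-dual basis of $\mathbb{F}_2^n$.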

It is clear that the matrices $C^{(k)}$ are symmetric. Examples of multiplication
tables of $\mathbb{F}_2^3$ and $\mathbb{F}_2^4$ are shown in Figures 1 and 3. The corresponding matrices
are shown in Figures 2 and 4, respectively.

As mentioned above, the matrices found this way are symmetric, but even
more interesting symmetries can be found. Consider, for instance, Figure 1 and
notice that, by the definition, matrix $C^{(1)}$ is formed by taking the first columns
which are under vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) (which are above the
horizontal line), respectively. Again by definition, matrix $C^{(2)}$ is formed by taking
the second columns under those vectors, and $C^{(3)}$ by picking up the last columns.
Interestingly, the same matrices $C^{(1)}$, $C^{(2)}$, and $C^{(3)}$ can be obtained by directly
forming 3 x 3-matrices of the row vectors which lie under vectors (1, 0, 0), (0, 1, 0),
and (0, 0, 1) (which are above the horizontal line).
          | (1,0,0)   (0,1,0)   (0,0,1)
----------+-----------------------------
(1,0,0)   | (0,1,0)   (1,0,1)   (0,1,1)
(0,1,0)   | (1,0,1)   (0,0,1)   (1,1,0)
(0,0,1)   | (0,1,1)   (1,1,0)   (1,0,0)

Fig. 1. Multiplication table of $\mathbb{F}_2^3$



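$$C^{(1)} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \quad
C^{(2)} = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \quad
C^{(3)} = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$$

Fig. 2. Matrices corresponding to the multiplication table of $\mathbb{F}_2^3$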





            | (1,0,0,0)   (0,1,0,0)   (0,0,1,0)   (0,0,0,1)
------------+-----------------------------------------------
(1,0,0,0)   | (1,1,0,1)   (1,0,0,1)   (0,0,1,1)   (1,1,1,1)
(0,1,0,0)   | (1,0,0,1)   (0,1,1,1)   (0,1,1,0)   (1,1,0,0)
(0,0,1,0)   | (0,0,1,1)   (0,1,1,0)   (1,1,1,0)   (1,0,0,1)
(0,0,0,1)   | (1,1,1,1)   (1,1,0,0)   (1,0,0,1)   (1,0,1,1)

Fig. 3. Multiplication table of $\mathbb{F}_2^4$



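$$C^{(1)} = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}, \quad
C^{(2)} = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \end{pmatrix},$$

$$C^{(3)} = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}, \quad
C^{(4)} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix}$$

Fig. 4. Matrices corresponding to the multiplication table of $\mathbb{F}_2^4$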



Another interesting property is that the matrices $C^{(1)}$, $C^{(2)}$, and $C^{(3)}$ sum
up to the identity matrix. A more peculiar feature is that one of the matrices
(in this example $C^{(1)}$, say) generates a multiplicative group of order seven, and that
group augmented with the zero matrix forms the field of eight elements. The
following theorem and its proof clarify the phenomena mentioned above.

Theorem 1. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a self-dual basis of $\mathbb{F}_{2^n}/\mathbb{F}_2$, and matrices $C^{(k)}$
defined as in (1). Then the following conditions hold:

1. $C^{(k)}_{ij} = C^{(j)}_{ik}$ for each i, j, k.

2. If $(a_1, \ldots, a_n) \neq (0, \ldots, 0)$, then $\det(a_1 C^{(1)} + \ldots + a_n C^{(n)}) \neq 0$.

3. $C^{(i)} C^{(j)} = C^{(j)} C^{(i)}$ for each i, j.

4. $C^{(1)} + \ldots + C^{(n)} = I$.

5. $(C^{(i)})^T = C^{(i)}$ for each $i \in \{1, \ldots, n\}$.

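Proof. Condition 5 is clear by the definition of the matrices $C^{(k)}$. Taking the
traces of both sides of the equation

$$\alpha_i \alpha_j = \sum_{k=1}^{n} C^{(k)}_{ij} \alpha_k \qquad (2)$$

gives, since $\mathrm{Tr}(\alpha_k) = \mathrm{Tr}(\alpha_k^2) = 1$,

$$\mathrm{Tr}(\alpha_i \alpha_j) = \sum_{k=1}^{n} C^{(k)}_{ij} \mathrm{Tr}(\alpha_k) = \sum_{k=1}^{n} C^{(k)}_{ij},$$

which proves 4. Multiplication of (2) by $\alpha_k$ and taking the traces gives

$$\mathrm{Tr}(\alpha_i \alpha_j \alpha_k) = \sum_{l=1}^{n} C^{(l)}_{ij} \mathrm{Tr}(\alpha_l \alpha_k) = C^{(k)}_{ij},$$

which shows that condition 1 is satisfied. In fact, the above equation shows
directly that when referring to the matrix element $C^{(k)}_{ij}$ we can permute i, j,
and k in any way.

We can now show that the matrices $C^{(k)}$ in fact generate a matrix
representation of the field $\mathbb{F}_{2^n}$ (the zero matrix must, of course, be included, too):
Any mapping $\mathbb{F}_{2^n} \to \mathbb{F}_{2^n}$ defined by the rule $x \mapsto ax$ is linear, and can therefore be
also considered as a linear mapping $\mathbb{F}_2^n \to \mathbb{F}_2^n$. The matrix of the linear mapping
$\mathbb{F}_2^n \to \mathbb{F}_2^n$ corresponding to element a is known as a matrix representation of
$a \in \mathbb{F}_{2^n}$. It is a well-known fact that the matrices found in this way also form
the field $\mathbb{F}_{2^n}$ with respect to the ordinary matrix sum and multiplication. The set of
matrices found in this way is called a matrix representation of field $\mathbb{F}_{2^n}$. In what
follows, we use the basis $\{\alpha_1, \ldots, \alpha_n\}$ to present those matrices.

Let us fix an element $a = \sum_{i=1}^{n} a_i \alpha_i \in \mathbb{F}_{2^n}$. Then for each basis element $\alpha_j$
we have

$$a \alpha_j = \sum_{i=1}^{n} a_i \alpha_i \alpha_j = \sum_{i=1}^{n} a_i \sum_{k=1}^{n} C^{(k)}_{ij} \alpha_k = \sum_{k=1}^{n} \Big( \sum_{i=1}^{n} a_i C^{(i)}_{kj} \Big) \alpha_k,$$

which shows that the matrix $\sum_{i=1}^{n} a_i C^{(i)}$ is in fact a matrix representation of the element
$\sum_{i=1}^{n} a_i \alpha_i$. Conditions 2 and 3 follow now immediately. □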
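
The five conditions can be checked directly on the data of Fig. 1. The sketch below is an illustration only: it hard-codes the products $e_i \cdot e_j$ read off the multiplication table of $\mathbb{F}_2^3$, builds the matrices $C^{(k)}$ from them, and evaluates each condition over $\mathbb{F}_2$. The helper names are invented here, and condition 1 is tested in the form $C^{(k)}_{ij} = C^{(j)}_{ik}$.

```python
from itertools import product

N = 3
# Products e_i . e_j read off Fig. 1 (1-based indices, upper triangle only).
TABLE = {
    (1, 1): (0, 1, 0), (1, 2): (1, 0, 1), (1, 3): (0, 1, 1),
    (2, 2): (0, 0, 1), (2, 3): (1, 1, 0), (3, 3): (1, 0, 0),
}


def basis_product(i, j):
    return TABLE[(min(i, j), max(i, j))]


# C[k][i][j] is the coefficient of e_(k+1) in e_(i+1) . e_(j+1).
C = [[[basis_product(i + 1, j + 1)[k] for j in range(N)]
      for i in range(N)] for k in range(N)]
IDENTITY = [[int(i == j) for j in range(N)] for i in range(N)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) % 2
             for j in range(N)] for i in range(N)]


def combination(coeffs):
    """Return a_1 C^(1) + ... + a_n C^(n) over F_2."""
    return [[sum(c * C[k][i][j] for k, c in enumerate(coeffs)) % 2
             for j in range(N)] for i in range(N)]


def is_invertible(a):
    """Gaussian elimination over F_2; True iff the matrix has full rank."""
    a = [row[:] for row in a]
    for col in range(N):
        pivot = next((r for r in range(col, N) if a[r][col]), None)
        if pivot is None:
            return False
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(N):
            if r != col and a[r][col]:
                a[r] = [x ^ y for x, y in zip(a[r], a[col])]
    return True


def multiplicative_order(a):
    power, order = a, 1
    while power != IDENTITY:
        power = mat_mul(power, a)
        order += 1
    return order


INDICES = list(product(range(N), repeat=3))
COND1 = all(C[k][i][j] == C[j][i][k] for i, j, k in INDICES)
COND2 = all(is_invertible(combination(a))
            for a in product((0, 1), repeat=N) if any(a))
COND3 = all(mat_mul(C[i], C[j]) == mat_mul(C[j], C[i])
            for i in range(N) for j in range(N))
COND4 = combination((1,) * N) == IDENTITY
COND5 = all(C[k][i][j] == C[k][j][i] for i, j, k in INDICES)
```

Because the table is stored as an upper triangle, condition 5 holds by construction here; the other four are genuine checks. `multiplicative_order(C[0])` computes the order of $C^{(1)}$ under matrix multiplication, which relates to the remark about the group of order seven.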

Also, a "converse" of the above theorem can be shown true.

Theorem 2. Let $C^{(1)}, \ldots, C^{(n)} \in \mathbb{F}_2^{n \times n}$ be matrices which satisfy the conditions
1-5 of the previous theorem. If the multiplication in $\mathbb{F}_2^n$ is defined as

$$e_i \cdot e_j = \sum_{k=1}^{n} C^{(k)}_{ij} e_k$$

and extended to be a bilinear operation, then $\mathbb{F}_2^n$ becomes a field, and the
natural basis is self-dual.
Proof. The additive structure of $\mathbb{F}_2^n$ is trivial, so we only have to consider the
multiplicative structure. That the distributive law holds is straightforward. By
condition 5 we have that

$$e_i \cdot e_j = \sum_{k=1}^{n} C^{(k)}_{ij} e_k = \sum_{k=1}^{n} C^{(k)}_{ji} e_k = e_j \cdot e_i,$$

meaning that the basis elements form a commuting set. It follows directly that
all elements of $\mathbb{F}_2^n$ commute in multiplication.

It follows from the properties 1, 5, and 4 that

$$\Big( \sum_{i=1}^{n} e_i \Big) \cdot e_j = \sum_{i=1}^{n} e_i \cdot e_j = \sum_{i=1}^{n} \sum_{k=1}^{n} C^{(k)}_{ij} e_k = \sum_{k=1}^{n} \sum_{i=1}^{n} C^{(i)}_{kj} e_k = e_j,$$

which shows that $e_1 + \ldots + e_n$ is the unit element with respect to the
multiplication.

A direct calculation shows that

$$(e_i \cdot e_j) \cdot e_k = \sum_{r=1}^{n} C^{(r)}_{ij} e_r \cdot e_k = \sum_{s=1}^{n} \sum_{r=1}^{n} C^{(r)}_{ij} C^{(s)}_{rk} e_s.$$

Similarly,

$$e_i \cdot (e_j \cdot e_k) = \sum_{s=1}^{n} \sum_{r=1}^{n} C^{(r)}_{jk} C^{(s)}_{ir} e_s.$$

To prove $(e_i \cdot e_j) \cdot e_k = e_i \cdot (e_j \cdot e_k)$ it remains to show that for each $s \in \{1, \ldots, n\}$,
the equation

$$\sum_{r=1}^{n} C^{(r)}_{ij} C^{(s)}_{rk} = \sum_{r=1}^{n} C^{(r)}_{jk} C^{(s)}_{ir} \qquad (3)$$

holds. But conditions 1 and 5 imply that the left hand side of (3) is equal to

$$\sum_{r=1}^{n} C^{(i)}_{jr} C^{(k)}_{rs} = (C^{(i)} C^{(k)})_{js}$$

and that the right hand side is equal to

$$\sum_{r=1}^{n} C^{(k)}_{jr} C^{(i)}_{rs} = (C^{(k)} C^{(i)})_{js} = (C^{(i)} C^{(k)})_{js}$$

because of condition 3. It follows that the multiplication is associative.

It remains to demonstrate that each nonzero element has an inverse. For that
purpose, assume that an element $a = \sum_{i=1}^{n} a_i e_i$ is nonzero. To find the inverse
x of a, we have to solve the equation

$$a \cdot x = 1 = e_1 + \ldots + e_n,$$

which can be written as

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} a_i x_j C^{(k)}_{ij} e_k = \sum_{k=1}^{n} e_k,$$

or, equivalently, as

$$\sum_{i=1}^{n} \sum_{j=1}^{n} a_i x_j C^{(k)}_{ij} = 1$$

for each k. Using again 1 and 5, the above can be written as

$$\sum_{j=1}^{n} \sum_{i=1}^{n} a_i C^{(i)}_{kj} x_j = 1. \qquad (4)$$

Since $(a_1, \ldots, a_n) \neq (0, \ldots, 0)$, the matrix $a_1 C^{(1)} + \ldots + a_n C^{(n)}$ is invertible by
condition 2, and hence the system of equations (4) is solvable. □



Remark 1. Condition 2 of Theorem 1 seems to be the most complicated one,
but unfortunately it cannot be derived from the other conditions, even if
the matrices $C^{(i)}$ are all invertible. To see why this holds, consider the following
4 x 4-matrices

$$C^{(1)} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}, \quad
C^{(2)} = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix},$$

$$C^{(3)} = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{pmatrix}, \quad
C^{(4)} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix},$$

which are all invertible. A direct calculation shows that they satisfy all conditions
but the second one, since, for instance, $\det(C^{(1)} + C^{(4)}) = 0$. This proves that the above
matrices cannot be used to create a field structure on $\mathbb{F}_2^4$. On the other hand,
the proof of Theorem 2 shows that the above matrices induce a commutative
ring structure in $\mathbb{F}_2^4$, since condition 2 was used only to prove the existence of
inverses of non-zero elements.
Abstract. We give faster algorithms for two methods of reducing the
number of states in nondeterministic finite automata. The first uses
equivalences and the second uses preorders. We develop restricted
reduction algorithms that operate on position automata while preserving
some of their properties. We show empirically that these reductions are
effective in largely reducing the memory requirements of regular expression
search algorithms, and compare the effectiveness of different reductions.



1 Introduction 

Regular expression handling is at the heart of many applications, such as lin- 
guistics, computational biology, pattern recognition, text retrieval, and so on. 
An elegant theory gives the support to easily and efficiently solve many complex 
problems by mapping them to regular expressions, then obtaining a nondeter- 
ministic finite automaton (NFA) that recognizes it, and finally making it deter- 
ministic (a DFA). However, a severe obstacle in any real implementation of the 
above scheme is the size of the DFA, which can be exponential in the length of 
the original regular expression. 

Although a simple algorithm for minimizing DFAs exists [5] , it has the prob- 
lem of requiring prior construction of the DFA to later minimize it. This can be 
infeasible because of main memory requirements and construction cost. 

A much more promising (and more challenging) alternative is that of directly 
reducing the NFA before converting it into a DFA. This has the advantage of 
working over a much smaller structure (of size polynomial in the length of the 
regular expression) and of building the smaller DFA without the need to go 
through a larger one first. 

However, the NFA state minimization problem is very hard (PSPACE-complete,
[10]) and therefore algorithms such as [11, 13, 14] cannot be used in practice.
There are also algorithms which build small NFAs from regular expressions,
see [7, 4], but they consider the total size, that is, they count both states and
transitions, and they artificially increase the number of states to reduce the
number of transitions. As the implementation crucially depends on the number of
states, such algorithms may not help.

The approach we follow is reducing the size of a given NFA. The idea of
reducing the size of NFAs by merging states was first introduced by Ilie and Yu [8]
who used equivalence relations. Later, Champarnaud and Coulon [2] modified
the idea to work for preorders. In this paper we give fast algorithms to compute
these two reductions. We show that the algorithm based on equivalences can be
implemented in $O(m \log n)$ time on an NFA with n states and m transitions,
while that based on preorders can run in $O(mn)$ time. Both results improve the
previous work.

When starting from a regular expression, the initial NFA, which we want 
to reduce, is the position automaton. Navarro and Raffinot [17, 18] showed that 
its special properties permit a more compact DFA representation. Our modified 
reductions are restricted to preserve those properties and hence may produce 
NFAs with more states than the original reductions. 

Finally, we empirically evaluate the impact of the reduction algorithms. We 
show that the number of NFA states can be reduced by 10%-40%. Those reduc- 
tions translate into huge reductions in the DFA size, with factors of up to 10“®. 
We also compare the alternatives of full reduction versus restricted reduction,
since the former yields fewer NFA states but the latter permits a more compact
DFA representation. The results show that full reduction is preferable in most
cases of interest.

2 Basic Notions 

We recall here the basic definitions we need throughout the paper. For further 
details we refer to [6] or [22]. 

Let A be an alphabet and $A^*$ the set of all words over A; $\varepsilon$ denotes the empty
word. A language over A is a subset of $A^*$. A nondeterministic finite automaton
(NFA) is a tuple $M = (Q, A, \delta, I, F)$, where Q is the set of states, $I \subseteq Q$ is the
set of initial states, $F \subseteq Q$ is the set of final states, and $\delta : Q \times A \to 2^Q$ is the
transition mapping; $\delta$ is extended to $\delta : 2^Q \times A^* \to 2^Q$ by $\delta(S, a) = \bigcup_{q \in S} \delta(q, a)$
and $\delta(S, \varepsilon) = S$, $\delta(S, aw) = \delta(\delta(S, a), w)$, for $S \subseteq Q$, $w \in A^*$. The language
recognized by M is $L(M) = \{w \in A^* \mid \delta(I, w) \cap F \neq \emptyset\}$. For a state $q \in Q$, we
denote

$$\mathcal{L}_L(M, q) = \{w \in A^* \mid q \in \delta(I, w)\},$$

$$\mathcal{L}_R(M, q) = \{w \in A^* \mid \delta(q, w) \cap F \neq \emptyset\};$$

when M is understood, we write simply $\mathcal{L}_L(q)$ and $\mathcal{L}_R(q)$, resp. The reversed
automaton of M is $M^R = (Q, A, \delta^R, F, I)$, where $q \in \delta^R(p, a)$ iff $p \in \delta(q, a)$.
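
These definitions translate almost literally into code. The following is a small sketch, assuming transitions are stored as a dictionary from (state, letter) pairs to sets of states; the class and method names are illustrative only.

```python
class NFA:
    """NFA M = (Q, A, delta, I, F); delta maps (state, letter) to a set."""

    def __init__(self, states, alphabet, delta, initial, final):
        self.states = set(states)
        self.alphabet = set(alphabet)
        self.delta = delta
        self.initial = set(initial)
        self.final = set(final)

    def step(self, subset, letter):
        """delta(S, a): union of delta(q, a) over q in S."""
        result = set()
        for q in subset:
            result |= self.delta.get((q, letter), set())
        return result

    def run(self, subset, word):
        """Extended transition mapping delta(S, w)."""
        for letter in word:
            subset = self.step(subset, letter)
        return set(subset)

    def accepts(self, word):
        """w is in L(M) iff delta(I, w) meets F."""
        return bool(self.run(self.initial, word) & self.final)

    def reversed(self):
        """Reversed automaton: flip every transition, swap I and F."""
        rdelta = {}
        for (p, a), targets in self.delta.items():
            for q in targets:
                rdelta.setdefault((q, a), set()).add(p)
        return NFA(self.states, self.alphabet, rdelta, self.final, self.initial)


# Hypothetical example: words over {a, b} that end in "ab".
ENDS_IN_AB = NFA(
    states={0, 1, 2},
    alphabet={"a", "b"},
    delta={(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}},
    initial={0},
    final={2},
)
```

The reversed automaton of `ENDS_IN_AB` recognizes the mirror images of its words, that is, words starting with "ba".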

3 NFA Reduction with Equivalences 

The idea of reducing the size of NFAs by merging states was investigated first by
Ilie and Yu [8]; see also [9]. We describe it briefly in this section.




Lucian Ilie, Gonzalo Navarro, and Sheng Yu



Let $M = (Q, A, \delta, I, F)$ be an NFA. We define $\equiv_R$ as the coarsest equivalence
relation over Q that satisfies:

(P1) $\equiv_R \cap\, (F \times (Q - F)) = \emptyset$,

(P2) for any $p, q \in Q$, $a \in A$, ($p \equiv_R q \Rightarrow \forall q' \in \delta(q, a), \exists p' \in \delta(p, a), q' \equiv_R p'$).

The equivalence $\equiv_R$ is the largest equivalence over Q which is right-invariant
w.r.t. M; see [8, 9]. Given $\equiv_R$, the algorithm to reduce the automaton M using
it is trivial: simply merge all states in the same equivalence class and modify the
transitions accordingly. Here is an example.

Example 1. The NFA in Fig. 4 is reduced using $\equiv_R$ as shown in Fig. 1; the
equivalence classes are also shown.

[Figure: the reduced automaton; classes of $\equiv_R$: {0}, {1,2,3,4,5,6}; transition labels a, b]

Fig. 1. $A_{\mathrm{pos}}(\tau)/{\equiv_R}$ for $\tau = (a + b)(a^* + ba^* + b^*)^*$



Symmetrically, the relation $\equiv_L$ can be defined using the reversed automaton.
The automaton M can be reduced according to either equivalence. As examples
in [9] show, M can be reduced more using both equivalences but the problem of
finding the best way to do the reduction is open. Fig. 2 gives an example (from
[9]) where there is no unique way to reduce optimally using both $\equiv_R$ and $\equiv_L$.




4 Computing Equivalences 

The algorithm in [8] for computing =r runs in low polynomial time but the 
problem of finding a fast algorithm was left open. We show here that an old very 
fast algorithm of Paige and Tarjan [19] can be used to solve the problem. 

Recall some definitions from [19]. For a binary relation E over a finite set
U we denote, for any subset $S \subseteq U$, $E^{-1}(S) = \{x \mid \exists y \in S \text{ such that } xEy\}$. A
subset $B \subseteq U$ is called stable w.r.t. S if either $B \subseteq E^{-1}(S)$ or $B \cap E^{-1}(S) = \emptyset$.
A partition P of U is stable w.r.t. S if all the blocks of P are stable w.r.t. S.
P is stable if it is stable w.r.t. each of its own blocks. The relational coarsest
partition problem is that of finding, for a given relation E and a partition P
over a set U, the coarsest stable refinement of P. Paige and Tarjan [19] gave an
algorithm for this problem which runs in time $O(m \log n)$ and space $O(m + n)$,
where $n = \mathrm{card}(U)$, $m = \mathrm{card}(E)$. They remarked that the algorithm works also
for several relations.

This algorithm applies to our problem of finding $\equiv_R$ as follows. For any
$a \in A$, denote $\delta_a = \{(p, q) \in Q \times Q \mid q \in \delta(p, a)\}$. Then $\equiv_R$ is the coarsest stable
refinement of the partition $\{F, Q - F\}$ w.r.t. all relations $\delta_a$, $a \in A$.

Therefore, if the number of states in our automaton is n and the number of
transitions is m, we have the following theorem.

Theorem 1. The equivalences $\equiv_R$ and $\equiv_L$ can be computed in time $O(m \log n)$
and space $O(m + n)$.
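
To make the characterization concrete, here is a deliberately naive sketch that computes the coarsest stable refinement of $\{F, Q - F\}$ by repeated splitting. It is not the Paige and Tarjan algorithm and does not achieve the $O(m \log n)$ bound; it only illustrates what is being computed. The transition table `TAU_DELTA` is the position automaton of $(a + b)(a^* + ba^* + b^*)^*$ as derived by hand for this illustration from the first/follow construction, and all names are assumptions of this sketch.

```python
def right_equivalence(states, alphabet, delta, final):
    """Classes of the coarsest partition refining {F, Q - F} that is stable
    w.r.t. every relation delta_a (naive refinement, not Paige-Tarjan)."""
    block_of = {q: int(q in final) for q in states}
    letters = sorted(alphabet)
    while True:
        # Two states stay together iff they are in the same block and, for
        # every letter, reach exactly the same set of blocks.
        signature = {
            q: (block_of[q],
                tuple(frozenset(block_of[t] for t in delta.get((q, a), ()))
                      for a in letters))
            for q in states
        }
        ids = {}
        refined = {q: ids.setdefault(signature[q], len(ids)) for q in states}
        if len(ids) == len(set(block_of.values())):
            break                      # no block was split: partition is stable
        block_of = refined
    classes = {}
    for q in states:
        classes.setdefault(block_of[q], set()).add(q)
    return sorted(sorted(c) for c in classes.values())


# Position automaton of (a + b)(a* + ba* + b*)*, derived by hand (assumption).
TAU_DELTA = {
    (0, "a"): {1}, (0, "b"): {2},
    (1, "a"): {3}, (1, "b"): {4, 6},
    (2, "a"): {3}, (2, "b"): {4, 6},
    (3, "a"): {3}, (3, "b"): {4, 6},
    (4, "a"): {3, 5}, (4, "b"): {4, 6},
    (5, "a"): {3, 5}, (5, "b"): {4, 6},
    (6, "a"): {3}, (6, "b"): {4, 6},
}
TAU_CLASSES = right_equivalence(range(7), {"a", "b"}, TAU_DELTA, {1, 2, 3, 4, 5, 6})
```

For this automaton the result agrees with the classes {0} and {1,2,3,4,5,6} stated in Example 1 of Section 3.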

It is interesting to notice that, to reduce NFAs by equivalences, we employed 
an idea from deterministic finite automata (DFA) reduction and then, to make it 
fast, we used an algorithm which was inspired itself from Hopcroft’s algorithm [5] 
to reduce DFAs. 

5 NFA Reduction with Preorders 

Champarnaud and Coulon [2] noticed that a better reduction can be obtained
if the axioms (P1) and (P2) above are used to construct a preorder relation
instead of an equivalence. Let us denote the largest (w.r.t. inclusion) preorder
which verifies (P1) and (P2) by $\subseteq_R$. It is then immediate that $p \subseteq_R q$ implies
$\mathcal{L}_R(p) \subseteq \mathcal{L}_R(q)$.

As in the case of equivalences, the relation $\subseteq_L$ is symmetrically defined using
the reversed automaton. Then, $p \subseteq_L q$ implies $\mathcal{L}_L(p) \subseteq \mathcal{L}_L(q)$.

The reduction with preorders is more complicated than with equivalences.
First, we can merge two states p and q as soon as any of the following conditions
is met:

(i) $p \subseteq_R q$ and $q \subseteq_R p$,

(ii) $p \subseteq_L q$ and $q \subseteq_L p$,

(iii) $p \subseteq_R q$ and $p \subseteq_L q$.

However, after merging two states, the preorders $\subseteq_R$ and $\subseteq_L$ must be updated
such that their relation with the languages $\mathcal{L}_R$ and $\mathcal{L}_L$ (see above) is preserved.
For instance, in the case (i), assuming the merged state of p and q is denoted q,
the update amounts to removing from $\subseteq_L$ all pairs (q, s) for which $p \not\subseteq_L s$. Case
(ii) is handled similarly and (iii) does not need any update.

An open problem here is how to merge the states using the two preorders
such that the reduction of the NFA is optimal; see the example in Fig. 2.

Since the preorder requirement is weaker than equivalence, $p \equiv_R q$ implies
that $p \subseteq_R q$ and $q \subseteq_R p$. The converse is not true in general (see [2] for an
example). Therefore, using preorders we have a chance to obtain a better reduction
of the NFA. It remains to investigate how much better.







6 Computing Preorders 

We give here an algorithm to compute the preorders $\subseteq_R$ and $\subseteq_L$. Assuming that
$Q = \{1, 2, \ldots, n\}$ and the number of transitions is m, our algorithm runs in
time $O(mn)$ and space $O(n^2)$. The best algorithm given by Champarnaud and
Coulon [2] runs in time O(mn^). 

We shall compute the complement $\not\subseteq_R$ of $\subseteq_R$ by the algorithm preorder(M)
from Fig. 3; $\omega$ is the relation which is $\not\subseteq_R$ at the end. According to the definition
of $\subseteq_R$, its complement $\not\subseteq_R$ is the smallest relation over Q such that

(P1') $F \times (Q - F) \subseteq\ \not\subseteq_R$,

(P2') for any $i, j \in Q$, $a \in A$, ($\exists i' \in \delta(i, a), \forall j' \in \delta(j, a), i' \not\subseteq_R j'$) $\Rightarrow i \not\subseteq_R j$.

So, we add (i, j) to $\not\subseteq_R$ based on the fact that there is $i' \in \delta(i, a)$ for which the
number of those $j' \in \delta(j, a)$ with $i' \not\subseteq_R j'$ is precisely $\mathrm{card}(\delta(j, a))$; that is, all
j's. Therefore, we shall compute some matrices of counters N(a), for any $a \in A$;
N(a) is an n x n matrix such that

$$N(a)_{ij} = \mathrm{card}(\{\ell \in \delta(j, a) \mid i \not\subseteq_R \ell\}),$$

for all $i, j \in Q$. We start with all these counters set to zero and update them
anytime there is new information on $\not\subseteq_R$; any new pair added to $\not\subseteq_R$ is
enqueued (steps 9 and 19) and later dequeued (step 11) and processed such that
all counters involved are adequately updated (step 14). Anytime such a counter
$N(a)_{ik}$ reaches its maximum value $\mathrm{card}(\delta(k, a))$ (step 15), all pairs (j, k) such that
$i \in \delta(j, a)$ are added to $\not\subseteq_R$ if not already there (steps 16-18).

Let us show that the algorithm preorder(M) correctly computes the relation
$\not\subseteq_R$. First, it is clear that $\not\subseteq_R$ is obtained by adding all pairs in (P1') and
then using (P2') as long as pairs can still be added. Assume then

$$\not\subseteq_R\ = \{(i_1, j_1), \ldots, (i_r, j_r), \ldots, (i_s, j_s)\},$$

where the first r pairs are added because of (P1') and the remaining ones due to
(P2'). We show that the algorithm preorder(M) computes the same relation.
Denote by $\omega$ the relation computed by the algorithm. Obviously, $\omega \subseteq\ \not\subseteq_R$. Assume
there is $(i_t, j_t)$ in $\not\subseteq_R$ but not in $\omega$; consider such a pair with the lowest index t.
It must be that t > r since all pairs in $F \times (Q - F)$ are certainly added to $\omega$. As
$(i_t, j_t)$ is in $\not\subseteq_R$, there must be an $i' \in \delta(i_t, a)$ such that $(i_t, j_t)$ was added to $\not\subseteq_R$
because all pairs $(i', j')$, $j' \in \delta(j_t, a)$, were already in $\not\subseteq_R$. Thus, at least one of
those pairs is not in $\omega$. Since the index of $(i', j')$ is strictly smaller than
t, a contradiction is obtained.

The time complexity of the above algorithm is $O(m + n^2)$ for the preprocessing
and proportional to

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{a \in A} \big(\mathrm{card}(\delta^R(i, a)) + \mathrm{card}(\delta^R(j, a))\big) = O(mn)$$

for processing. Therefore, the time complexity is $O(mn)$. The space complexity
is $O(n^2)$. We have proved the following theorem.
preorder(M)

  - given: an NFA M = (Q, A, δ, I, F)
  - returns: the relation ⊈_R

   1.  for q ∈ Q, a ∈ A do                        // 1-4: preprocessing
   2.      compute δ^R(q, a) as a linked list
   3.      compute card(δ(q, a))
   4.  initialize all N(a)s with 0s
   5.  ω ← ∅; C ← newqueue()                      // 5-19: processing; initialize ω
   6.  for i ∈ F do                               // ω will be ⊈_R at the end
   7.      for j ∈ Q − F do
   8.          ω ← ω ∪ {(i, j)}
   9.          enqueue(C, (i, j))
  10.  while C ≠ ∅ do
  11.      (i, j) ← dequeue(C)                    // 11-19: updates due to
  12.      for a ∈ A do                           //   (i, j) being added to ω
  13.          for k ∈ δ^R(j, a) do
  14.              N(a)_ik ← N(a)_ik + 1          // 14: update counter
  15.              if N(a)_ik = card(δ(k, a)) then   // 15-18: update ω
  16.                  for j' ∈ δ^R(i, a) do         //   when a counter is maximal
  17.                      if (j', k) ∉ ω then
  18.                          ω ← ω ∪ {(j', k)}
  19.                          enqueue(C, (j', k))
  20.  return ω

Fig. 3. Algorithm for computing preorders



Theorem 2. The preorders $\subseteq_R$ and $\subseteq_L$ can be computed in time $O(mn)$ and
space $O(n^2)$.
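
The counter technique can be tried out on small automata. The sketch below follows the queue-and-counter scheme of Fig. 3 under several assumptions of its own: transitions are a dictionary from (state, letter) to sets, counters live in a dictionary rather than in matrices, and one extra seeding step is added for the vacuous case of (P2'), where $\delta(j, a)$ is nonempty but $\delta(k, a)$ is empty (that step is an addition of this sketch, not part of Fig. 3). A naive fixpoint computation of (P1') and (P2') is included for cross-checking.

```python
from collections import deque


def not_below(states, alphabet, delta, final):
    """Complement of the right preorder, computed with counters and a queue."""
    states = list(states)
    rdelta = {}
    for (p, a), targets in delta.items():
        for q in targets:
            rdelta.setdefault((q, a), set()).add(p)
    card = {(q, a): len(delta.get((q, a), ())) for q in states for a in alphabet}
    count = {}          # count[(a, i, k)]: successors l of k by a with (i, l) in omega
    omega, queue = set(), deque()

    def add(pair):
        if pair not in omega:
            omega.add(pair)
            queue.append(pair)

    for i in final:                               # (P1')
        for j in states:
            if j not in final:
                add((i, j))
    for a in alphabet:                            # vacuous case of (P2'), our addition
        for j in states:
            if delta.get((j, a)):
                for k in states:
                    if card[(k, a)] == 0:
                        add((j, k))
    while queue:
        i, j = queue.popleft()
        for a in alphabet:
            for k in rdelta.get((j, a), ()):
                key = (a, i, k)
                count[key] = count.get(key, 0) + 1
                if count[key] == card[(k, a)]:    # i is not below any successor of k
                    for p in rdelta.get((i, a), ()):
                        add((p, k))
    return omega


def not_below_naive(states, alphabet, delta, final):
    """Smallest relation satisfying (P1') and (P2'), by plain iteration."""
    states = list(states)
    omega = {(i, j) for i in final for j in states if j not in final}
    changed = True
    while changed:
        changed = False
        for i in states:
            for j in states:
                if (i, j) in omega:
                    continue
                if any(all((i2, j2) in omega for j2 in delta.get((j, a), ()))
                       for a in alphabet for i2 in delta.get((i, a), ())):
                    omega.add((i, j))
                    changed = True
    return omega


# Hypothetical example: state 1 accepts {a}, state 2 accepts {a, b}.
EXAMPLE_DELTA = {(0, "a"): {1, 2}, (1, "a"): {3}, (2, "a"): {3}, (2, "b"): {3}}
EXAMPLE_OMEGA = not_below(range(4), {"a", "b"}, EXAMPLE_DELTA, {3})
```

In the example, the pair (1, 2) is absent from the result (state 1 is below state 2) while (2, 1) is present, matching the inclusion of the right languages.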

7 Position Automaton 

We recall in this section the well-known construction of the position automaton³,
discovered independently by Glushkov [3] and McNaughton and Yamada [12].

Let $\alpha$ be a regular expression. The basic idea of the position automaton is to
assume that all occurrences of letters in $\alpha$ are different. For this, all letters are
made different by marking each letter with a unique index, called its position in $\alpha$.
The set of positions of $\alpha$ is $\mathrm{pos}(\alpha) = \{1, 2, \ldots, |\alpha|_A\}$, where $|\alpha|_A$ is the number
of letter occurrences in $\alpha$. We shall denote also $\mathrm{pos}_0(\alpha) = \mathrm{pos}(\alpha) \cup \{0\}$. The
expression obtained from $\alpha$ by marking each letter with its position is denoted
$\overline{\alpha} \in \overline{A}^*$, where $\overline{A} = \{a_i \mid a \in A, 1 \le i \le |\alpha|_A\}$. For instance, if $\alpha = a(baa + b^*)$,
then $\overline{\alpha} = a_1(b_2 a_3 a_4 + b_5^*)$. Notice that $\mathrm{pos}(\alpha) = \mathrm{pos}(\overline{\alpha})$. The same notation is
also used for unmarking, that is, $\overline{\overline{\alpha}} = \alpha$.

Three mappings first, last, and follow are then defined as follows (see [3]). For
any regular expression $\alpha$ and any $i \in \mathrm{pos}(\alpha)$, we have:

³ This automaton is sometimes called Glushkov automaton; e.g., in [18].
$$\begin{aligned}
\mathrm{first}(\alpha) &= \{i \mid a_i w \in \mathcal{L}(\overline{\alpha})\}, \\
\mathrm{last}(\alpha) &= \{i \mid w a_i \in \mathcal{L}(\overline{\alpha})\}, \\
\mathrm{follow}(\alpha, i) &= \{j \mid u a_i a_j v \in \mathcal{L}(\overline{\alpha})\}.
\end{aligned} \qquad (1)$$

We extend follow by $\mathrm{follow}(\alpha, 0) = \mathrm{first}(\alpha)$. Also, let $\mathrm{last}_0(\alpha)$ stand for $\mathrm{last}(\alpha)$ if
$\varepsilon \notin \mathcal{L}(\alpha)$ and $\mathrm{last}(\alpha) \cup \{0\}$ otherwise.

The position automaton for $\alpha$ is

$$A_{\mathrm{pos}}(\alpha) = (\mathrm{pos}_0(\alpha), A, \delta_{\mathrm{pos}}, 0, \mathrm{last}_0(\alpha))$$

where

$$\delta_{\mathrm{pos}} = \{(i, a, j) \mid j \in \mathrm{follow}(\alpha, i), a = \overline{a_j}\}.$$

Besides the property of accepting the language expressed by the original regular
expression, that is, $L(A_{\mathrm{pos}}(\alpha)) = \mathcal{L}(\alpha)$, the position automaton has two very
important properties. First, the number of states is always $|\alpha|_A + 1$, which makes
it work better than Thompson's automaton [20] in bit-parallel regular expression
search algorithms [18]. Second, all transitions incoming to any given state are
labelled by the same letter, a property exploited by Navarro and Raffinot [17, 18]
in regular expression search algorithms to represent the DFA using $O(2^{|\alpha|_A + 1} + |A|)$
bit-masks of length $|\alpha|_A$, rather than $O(2^{|\alpha|_A + 1} |A|)$.

Example 2. Consider the regular expression $\tau = (a + b)(a^* + ba^* + b^*)^*$. The
marked version of $\tau$ is $\overline{\tau} = (a_1 + b_2)(a_3^* + b_4 a_5^* + b_6^*)^*$. The values of the mappings
first, last, and follow for $\tau$ and the corresponding position automaton $A_{\mathrm{pos}}(\tau)$
are given in Fig. 4.

[Fig. 4: first, last, and follow for $\tau$ and the position automaton $A_{\mathrm{pos}}(\tau)$; figure not recoverable from the extraction]

The position automaton can be computed easily in cubic time using the
inductive definitions of first, last, and follow, but Brüggemann-Klein [1] showed
how to compute it in quadratic time.
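
The inductive definitions of first, last, and follow can be sketched as a single recursive walk over the expression tree. This is only an illustration under assumptions of its own: expressions are encoded as nested tuples ('sym', letter), ('alt', e1, e2), ('cat', e1, e2), ('star', e), and no attempt is made to reach the quadratic bound of [1].

```python
def position_automaton(expr):
    """Return (delta, final, letters) of the position automaton of expr.
    delta maps (state, letter) to a set of states; state 0 is initial."""
    letters = {}            # position -> letter at that position
    follow = {}             # position -> set of positions that may follow it

    def walk(node):
        """Return (nullable, first, last) of the subexpression."""
        kind = node[0]
        if kind == "sym":
            pos = len(letters) + 1          # positions numbered left to right
            letters[pos] = node[1]
            follow[pos] = set()
            return False, {pos}, {pos}
        if kind == "star":
            _, first, last = walk(node[1])
            for p in last:
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                         # concatenation
            follow[p] |= f2
        return n1 and n2, (f1 | f2) if n1 else f1, (l1 | l2) if n2 else l2

    nullable, first, last = walk(expr)
    follow[0] = set(first)                   # follow(alpha, 0) = first(alpha)
    final = set(last) | ({0} if nullable else set())
    delta = {}
    for i, successors in follow.items():
        for j in successors:
            delta.setdefault((i, letters[j]), set()).add(j)
    return delta, final, letters


def sym(c):
    return ("sym", c)


# alpha = a(baa + b*), the example used earlier in this section.
ALPHA = ("cat", sym("a"),
         ("alt", ("cat", sym("b"), ("cat", sym("a"), sym("a"))),
          ("star", sym("b"))))
ALPHA_DELTA, ALPHA_FINAL, ALPHA_LETTERS = position_automaton(ALPHA)
```

Since every transition into state j is labelled with the letter at position j, the property that all incoming transitions of a state carry the same letter holds by construction.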
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8 Reducing the Position Automaton 

In this section we show how the position automaton can be reduced using equiv- 
alences and/or preorders such that its essential properties are preserved. 

Consider a regular expression a and define the equivalence over posQ(a) 
by i 3 iff the letter labelling all transitions incoming to i is the same as the 
one for j . 

The idea is to reduce the position automaton such that the transitions in- 
coming to a given state are still labelled the same. Therefore, any states we 
merge must be in Using equivalences, say =r, we merge according to the 
equivalence =r C Fig. 5 shows an example. 



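Fig. 5. $A_{pos}(\tau)$ reduced by the restricted right-equivalence, for $\tau = (a + b)(a^* + ba^* + b^*)^*$; the equivalence classes are $\{0\}$, $\{1,3,5\}$, $\{2,4,6\}$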




Using preorders, we do just as before with the restriction imposed by the equivalence defined above.



9 Experimental Results 

In this section we aim at establishing how significant the reduction obtained
using equivalences is, and its relevance to regular expression search algorithms.
In particular, we are interested in comparing two choices: full right-equivalence,
where the properties of the position automaton are not preserved (Section 3),
and restricted right-equivalence, where those properties are preserved (Section 8).
While full right-equivalence can potentially yield larger reductions in the number of
NFA states, it requires a representation of $O(2^{n_f} |A|)$ cells for the DFA ($n_f$ is the
number of reduced NFA states, $A$ is the alphabet). Restricted right-equivalence
may yield more states, but permits a more compact representation in $O(2^{n_r} + |A|)$
cells for the DFA ($n_r$ is the number of NFA states after the restricted reduction).

We have tested regular expressions on DNA, having 10 to 100 alphabet
symbols, averaging over 10,000 expressions per length, with density of operators
from 0.1 to 0.4 (Section 9.1 gives more details on the generation process). For
each such expression, we built its position automaton (Section 7), and then
applied full and restricted reduction. Figure 6 shows the reductions obtained, as
a fraction of the original number of states (which was always $n = 1 + |\alpha|_A$
because of Glushkov's construction). There is a second version of the full reduction,
where the NFA is previously modified to include a self-loop at the initial state,
for search purposes. This permits no further restricted reduction, but allows a
slightly better full reduction, as can be seen. Both reduction factors tend to
stabilize as $n$ grows, being better for higher density of operators in the regular
expression. It is also clear that full reduction gives substantially better reductions
than restricted reduction (10%-20% better).



[Figure 6: four panels, one per density of operators (10%, 20%, 30%, 40%); horizontal axis: number of characters, from 10 to 100.]



Fig. 6. Reduction factors in number of states obtained over position automata built 
from regular expressions of lengths 10 to 100 and density of operators from 0.1 to 0.4, 
built from DNA text 



As explained, the above does not immediately mean that full reduction is
better, because its DFA representation must have one table per alphabet letter.
Figure 7 shows the reduction fraction in the representation of the DFA,
compared to that of the position automaton. This time the difference between the
search automaton and the original automaton is negligible. As can be seen, the restricted
reduction is convenient only for $n < 10$ to $n < 30$, depending on the operator
density. Note, on the other hand, that those short expression lengths imply that
even the original position automaton is not problematic in terms of space or
construction cost. That is, full reduction becomes superior precisely when the
space problem becomes important.

[Figure 7: four panels, one per density of operators (10%, 20%, 30%, 40%); horizontal axis: number of characters.]

Fig. 7. Reduction factors in DFA sizes obtained over position automata built from
regular expressions of lengths 10 to 100 and density of operators from 0.1 to 0.4, built
from DNA text



Figure 8 shows the same results in logarithmic scale, to show how large
the savings due to full reductions are when $n$ becomes large. The second version of
the NFA (for searching) is omitted since the difference in DFA size is unnoticeable.

As noted in [17,18], DFA space and construction cost can be traded for
search cost as follows: A single table of $O(2^n)$ entries can be split into $k$ tables
of size $O(2^{n/k})$ each, so that each such table has to be accessed for each text
character in order to build the original entry. Hence the search cost becomes
$O(k)$ per text character. If main memory is limited, a huge DFA actually means
larger search time, and a reduction in its size translates into better search times.
We have computed the number of tables needed with and without reductions
assuming that we dedicate 4 megabytes of RAM to the DFA. For the highest
operator density we have obtained speedups of up to 50% in search times, that
is, we need 2/3 of the tables needed by the original automaton.
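The space/time trade-off can be made concrete with a small calculation. The sketch below is not the procedure used in the experiments; it assumes (hypothetically) that every table entry stores one n-bit mask rounded up to whole bytes, and it searches for the smallest number k of tables of $2^{\lceil n/k \rceil}$ entries that fits in a given memory budget. The function name `tables_needed` is illustrative.

```python
import math

def tables_needed(n_states: int, budget_bytes: int) -> int:
    """Smallest k such that k tables of 2^ceil(n/k) entries fit in the budget.

    Assumption (not from the text): each entry is an n-bit mask stored in
    ceil(n/8) bytes.
    """
    entry_bytes = math.ceil(n_states / 8)
    for k in range(1, n_states + 1):
        bits_per_table = math.ceil(n_states / k)
        total = k * (2 ** bits_per_table) * entry_bytes
        if total <= budget_bytes:
            return k
    raise ValueError("budget too small even with one-bit tables")

BUDGET = 4 * 2 ** 20   # 4 megabytes, as in the experiments
k_original = tables_needed(60, BUDGET)
k_reduced = tables_needed(40, BUDGET)
```

Since the search cost is proportional to k, going from `k_original` tables to `k_reduced` tables shortens the time per text character by the same ratio; needing 2/3 of the tables corresponds to a search that is 1.5 times as fast.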



[Figure 8: one panel per density of operators.]

Fig. 8. Reduction factors in DFA sizes obtained over position automata built from
regular expressions of lengths 10 to 100 and density of operators from 0.1 to 0.4, built
from DNA text. Note the logarithmic scale

9.1 Generating Regular Expressions

The choice of test patterns is always problematic when dealing with regular
expressions, since there is no clear concept of what a random regular expression
is and, as far as we know, there is no public repository of regular expressions
available, except for a dozen trivial examples. We have chosen to generate
random regular expressions as follows:

1. We choose a base real-world text, in this case DNA from Homo sapiens.

2. We choose $n$ and pick a random text substring of length $n$.

3. We choose an operator density $0 \le \gamma \le 1$.

4. We apply a recursive procedure to convert a string of length $\ell$ into a regular
   expression:

   (a) An empty string is converted into an empty regular expression. In the
       rest, we assume a nonempty string.

   (b) With probability $1 - \gamma$ we choose that the expression will be the
       concatenation of two subexpressions: a left part of $\ell'$ characters and a right
       part of $\ell - \ell'$ characters, where $\ell'$ is chosen uniformly in the range
       $1 \le \ell' \le \ell - 1$. We recursively convert both subparts into regular
       expressions $e_1$ and $e_2$. The resulting expression is $e_1 \cdot e_2$. If $\ell = 1$ we simply
       write down the string character.

   (c) Otherwise, if the parent in the recursion has just generated a Kleene
       closure operator, we choose to add a union operator; if not, we
       choose with the same probability between a Kleene closure and a union.

   (d) If we chose that the expression will have a union operator, we choose a
       left part of $\ell'$ characters and a right part of $\ell - \ell'$ characters, where $\ell'$
       is chosen uniformly in the range $0 \le \ell' \le \ell$. We recursively convert both
       subparts into regular expressions $e_1$ and $e_2$. The resulting expression is
       $e_1 | e_2$.

   (e) If we chose to add a Kleene closure operator at the end of the string,
       we recursively generate a regular expression $e_1$ for the string. The
       resulting expression is $e_1^*$.
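The recursive procedure above can be sketched as follows. This is an illustrative reading of steps (a) to (e), not the generator used for the experiments: the function names, the use of "ε" for the empty expression, and the explicit parenthesization are assumptions, and the sketch expects an operator density strictly below 1 so that the recursion terminates with probability 1.

```python
import random

EPS = "ε"  # stands for the empty regular expression (step (a))

def star(e: str) -> str:
    """Apply a Kleene closure, parenthesizing anything longer than one symbol."""
    return e + "*" if len(e) == 1 else "(" + e + ")*"

def convert(s: str, gamma: float, rng: random.Random,
            parent_star: bool = False) -> str:
    if not s:                                   # step (a)
        return EPS
    if rng.random() < 1 - gamma:                # step (b): concatenation
        if len(s) == 1:
            return s
        cut = rng.randint(1, len(s) - 1)        # 1 <= l' <= l - 1
        return convert(s[:cut], gamma, rng) + convert(s[cut:], gamma, rng)
    # step (c): a closure directly under a closure is never generated
    use_union = parent_star or rng.random() < 0.5
    if use_union:                               # step (d)
        cut = rng.randint(0, len(s))            # 0 <= l' <= l, parts may be empty
        left = convert(s[:cut], gamma, rng)
        right = convert(s[cut:], gamma, rng)
        return "(" + left + "|" + right + ")"
    return star(convert(s, gamma, rng, parent_star=True))   # step (e)

rng = random.Random(2004)                       # fixed seed, for repeatability only
example = convert("ACATTTTAGT", 0.4, rng)
```

Whatever operators are inserted, the letters of the sampled substring appear in the result in their original order, which is why the length n of the expression is known in advance.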

The above procedure is just one of the many possible alternatives to generate
random regular expressions one could argue for, but it has several advantages.
First, it permits determining the length $n$ (number of characters of $A$) in advance.
Second, it takes the characters from the text, respecting its distribution. Third,
it permits us to choose expressions with more or fewer operators by varying $\gamma$. We
show experiments with $\gamma = 0.10$ to $\gamma = 0.40$. Examples obtained from our tests,
with $n = 10$, are “ACAT(T|ε)TT*AG(T|ε)” and “(GAT(ε|T*)*((ε|ε|T)*TA*)*(ε|ε|GT))*”,
respectively.

10 Conclusion

We have developed faster algorithms to implement two existing NFA reduction
techniques. We have also adapted them to work over position automata while
preserving the properties that allow a compact DFA representation. Finally,
we have empirically assessed the practical impact of the reductions, as well as
the convenience of preserving, or not, the position automata properties.

Future work involves empirically evaluating the impact of using preorders
instead of equivalences. The former are more complex and slower to compute,
and it is not clear which is the optimal way to apply the different reductions,
hence the importance of determining their practical value.



Abstract. In this paper, we provide some properties of classes of regular
languages consisting of directing words of directable automata and some
new results on the shortest directing words of nondeterministic directable
automata.



1 Introduction 

Let $X$ be a nonempty finite set, called an alphabet. An element of $X$ is called a
letter. By $X^*$, we denote the free monoid generated by $X$. Let $X^+ = X^* \setminus \{\varepsilon\}$
where $\varepsilon$ denotes the empty word of $X^*$. For the sake of simplicity, if $X = \{a\}$,
then we write $a^+$ and $a^*$ instead of $\{a\}^+$ and $\{a\}^*$, respectively. Let $L \subseteq X^*$.
Then $L$ is called a language over $X$. If $L \subseteq X^*$, then $L^+$ denotes the set of all
concatenations of words in $L$ and $L^* = L^+ \cup \{\varepsilon\}$. In particular, if $L = \{w\}$, then
we write $w^+$ and $w^*$ instead of $\{w\}^+$ and $\{w\}^*$, respectively. Let $u \in X^*$. Then
$u$ is called a word over $X$. If $u \in X^*$, then $|u|$ denotes the length of $u$, i.e. the
number of letters appearing in $u$. Notice that we also denote the cardinality of
a finite set $A$ by $|A|$.

A finite automaton (in short, an automaton) $\mathcal{A} = (S, X, \delta)$ consists of the
following data: (1) $S$ is a nonempty finite set, called a state set. (2) $X$ is a
nonempty finite alphabet. (3) $\delta$ is a function, called a state transition function,
of $S \times X$ into $S$.

The state transition function $\delta$ can be extended to the function of $S \times X^*$
into $S$ as follows: (1) $\delta(s, \varepsilon) = s$ for any $s \in S$. (2) $\delta(s, au) = \delta(\delta(s, a), u)$ for any
$s \in S$, $a \in X$ and $u \in X^*$.

Let $\mathcal{A} = (S, X, \delta)$ be an automaton, let $s \in S$ and let $u \in X^*$. In what
follows, we will write $su^{\mathcal{A}}$ instead of $\delta(s,u)$.
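The extension of the transition function to words is a direct recursion on the word. A minimal sketch, assuming the transition function is stored as a Python dictionary keyed by (state, letter); the two-state automaton used here is hypothetical and only serves as input:

```python
def extended_delta(delta, s, word):
    """delta(s, empty word) = s and delta(s, a u) = delta(delta(s, a), u)."""
    if not word:
        return s
    return extended_delta(delta, delta[(s, word[0])], word[1:])

# Hypothetical automaton with S = {0, 1} and X = {a, b}.
delta = {(0, "a"): 1, (0, "b"): 0,
         (1, "a"): 1, (1, "b"): 0}

after_ab = extended_delta(delta, 0, "ab")   # corresponds to 0(ab) in the notation of the text
```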

A finite recognizer $\mathcal{A} = (S, X, \delta, s_0, F)$ consists of the following data: (1) The
triple $(S, X, \delta)$ constitutes a finite automaton. (2) $s_0 \in S$ is called the initial
state. (3) $F \subseteq S$ is called the set of final states.

Let $\mathcal{A} = (S, X, \delta, s_0, F)$ be a finite recognizer. Then the language $\mathcal{T}(\mathcal{A}) =
\{u \in X^* \mid \delta(s_0,u) \in F\}$ is called the language accepted by $\mathcal{A}$.

Let $L \subseteq X^*$. Then $L$ is said to be regular if $L$ is accepted by a finite recognizer.

Now we define a directable automaton.

Definition 1. An automaton $\mathcal{A} = (S, X, \delta)$ is said to be directable if the
following condition is satisfied: There exists $w \in X^*$ such that $sw^{\mathcal{A}} = tw^{\mathcal{A}}$ for any
$s,t \in S$.

In the above definition, a word $w \in X^*$ is called a directing word of $\mathcal{A}$. Then
we have:

Fact. Let $\mathcal{A} = (S, X, \delta)$ be an automaton. Then $\mathcal{A}$ is directable if and only if for
any $s,t \in S$, there exists $u \in X^*$ such that $su^{\mathcal{A}} = tu^{\mathcal{A}}$.

Proposition 1. Assume that $\mathcal{A} = (S, X, \delta)$ is a directable automaton. Then the
set of directing words $D(\mathcal{A})$ of $\mathcal{A}$ is a regular language.

Let $\mathcal{A} = (S, X, \delta)$ be a directable automaton. By $d(\mathcal{A})$, we denote the value
$\min\{|w| \mid w \in D(\mathcal{A})\}$. Moreover, $d(n)$ denotes the value $\max\{d(\mathcal{A}) \mid \mathcal{A} =
(S, X, \delta)$ is a directable automaton with $n$ states$\}$. In the definition of $d(n)$,
$X$ ranges over all finite nonempty alphabets.

In [2], Černý conjectured the following.

Conjecture 1. For any $n \ge 1$, $d(n) = (n - 1)^2$.

However, the above problem is still open and at present we have only the
following result:

Proposition 2. For any $n \ge 1$, we have $(n - 1)^2 \le d(n) \le O(n^3)$.

The lower bound is due to [2] and the upper bound is due to [7] and [8].
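For small automata, $d(\mathcal{A})$ can be computed by a breadth-first search over subsets of states. The sketch below is an illustration, not part of the original argument. The family of automata built by `cerny` is the one commonly attributed to Černý in the literature (one letter is a cyclic shift, the other merges two states); that description, like all function names, is an assumption brought in from outside this text.

```python
from collections import deque

def shortest_directing_word(states, alphabet, delta):
    """Breadth-first search over subsets; returns a shortest w with |S w| = 1."""
    start = frozenset(states)
    seen = {start: ""}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if len(current) == 1:
            return seen[current]
        for a in alphabet:
            nxt = frozenset(delta[(s, a)] for s in current)
            if nxt not in seen:
                seen[nxt] = seen[current] + a
                queue.append(nxt)
    return None   # the automaton is not directable

def cerny(n):
    """n-state automaton: 'a' is a cyclic shift, 'b' sends n-1 to 0 and fixes the rest."""
    delta = {}
    for s in range(n):
        delta[(s, "a")] = (s + 1) % n
        delta[(s, "b")] = 0 if s == n - 1 else s
    return list(range(n)), "ab", delta

word = shortest_directing_word(*cerny(4))
```

For these automata the length of the word found matches the value $(n - 1)^2$ of the conjecture for the small values of n that are feasible to check this way.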

A similar problem for some classes of automata can be discussed. For
instance, an automaton $\mathcal{A} = (S, X, \delta)$ is said to be commutative if $s(uv)^{\mathcal{A}} = s(vu)^{\mathcal{A}}$
holds for any $s \in S$ and any $u,v \in X^*$. By $d_{com}(n)$, we denote the value
$\max\{d(\mathcal{A}) \mid \mathcal{A} = (S, X, \delta)$ is commutative and directable, and $|S| = n\}$. In the
definition of $d_{com}(n)$, $X$ ranges over all finite nonempty alphabets. The following
result is due to [9] and [10].

Proposition 3. For any $n \ge 1$, we have $d_{com}(n) = n - 1$.

2 Nondeterministic Directable Automata 

A nondeterministic automaton $\mathcal{A} = (S, X, \delta)$ consists of the following data: (1)
$S$, $X$ are the same materials as in the definition of finite automata. (2) $\delta$ is a
relation such that $\delta(s, a) \subseteq S$ for any $s \in S$ and any $a \in X \cup \{\varepsilon\}$.

As in the case of finite automata, $\delta$ can be extended to the following relation
in a natural way, i.e. $\delta(s, au) = \bigcup_{t \in \delta(s,a)} \delta(t, u)$ for any $s \in S$, any $u \in X^*$ and
any $a \in X \cup \{\varepsilon\}$. In what follows, we will write $su^{\mathcal{A}}$ instead of $\delta(s,u)$ as in the
case of finite automata.

Now we will deal with nondeterministic directable automata and their related
languages. For nondeterministic automata, the directability can be defined in
several ways. In each case, the directing words constitute a regular language.
We will consider six classes of regular languages with respect to the different
definitions of directability.

Let $\mathcal{A} = (S, X, \delta)$ be a nondeterministic automaton. In [5], the notion of
directing words of $\mathcal{A}$ is given. In the definition, $Sw^{\mathcal{A}}$ denotes $\bigcup_{s \in S} sw^{\mathcal{A}}$ for
$w \in X^*$.

Definition 2. (1) A word $w \in X^*$ is $D_1$-directing if $sw^{\mathcal{A}} \ne \emptyset$
for any $s \in S$ and $|Sw^{\mathcal{A}}| = 1$. (2) A word $w \in X^*$ is $D_2$-directing if $sw^{\mathcal{A}} = Sw^{\mathcal{A}}$
for any $s \in S$. (3) A word $w \in X^*$ is $D_3$-directing if $\bigcap_{s \in S} sw^{\mathcal{A}} \ne \emptyset$.

Definition 3. Let $i = 1,2,3$. Then $\mathcal{A}$ is called a $D_i$-directable automaton if the
set of $D_i$-directing words is not empty.
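The three notions can be compared on a concrete word with a few lines of code. The sketch below is illustrative only: it assumes transitions are stored as a dictionary from (state, letter) to a set of states, ignores ε-transitions, and uses hypothetical function names. The two-state automaton is the one that appears later in the proof of Proposition 6.

```python
def image(delta, subset, word):
    """Set of states reachable from `subset` by reading `word`."""
    current = set(subset)
    for a in word:
        current = set().union(*(delta.get((s, a), set()) for s in current))
    return current

def directing_types(states, delta, word):
    """Return (is D1-directing, is D2-directing, is D3-directing) for `word`."""
    per_state = {s: image(delta, {s}, word) for s in states}
    total = set().union(*per_state.values())                 # S w
    d1 = all(per_state[s] for s in states) and len(total) == 1
    d2 = all(per_state[s] == total for s in states)
    d3 = bool(set.intersection(*per_state.values()))
    return d1, d2, d3

# Two-state nondeterministic automaton over {a, b, c}.
delta = {(1, "a"): {1, 2}, (2, "a"): {2},
         (1, "b"): set(),  (2, "b"): {1, 2},
         (1, "c"): {1},    (2, "c"): set()}

types_abc = directing_types([1, 2], delta, "abc")
types_a = directing_types([1, 2], delta, "a")
```

On this automaton the word abc satisfies all three conditions, whereas the single letter a is $D_3$-directing (both images contain state 2) without being $D_1$- or $D_2$-directing.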

Let $\mathcal{A} = (S, X, \delta)$ be a nondeterministic automaton. Then, for any $i = 1, 2, 3$,
$D_i(\mathcal{A})$ denotes the set of all $D_i$-directing words. Then we have:

Proposition 4. For any $i = 1,2,3$, $D_i(\mathcal{A})$ is a regular language.

A nondeterministic automaton $\mathcal{A} = (S, X, \delta)$ is said to be complete if $sa^{\mathcal{A}} \ne \emptyset$
for any $s \in S$ and any $a \in X$. The $D_1$-directability of complete nondeterministic
automata was introduced by Burkhard in [1]. We will investigate the classes
of languages consisting of $D_1$-, $D_2$- and $D_3$-directing words of nondeterministic
automata and complete nondeterministic automata.

The classes of $D_i$-directable nondeterministic automata and complete
nondeterministic automata are denoted by $Dir(i)$ and $CDir(i)$, respectively. Let $X$
be an alphabet. For $i = 1,2,3$, we define the following classes of languages:

(1) $\mathcal{L}^X_{ND(i)} = \{D_i(\mathcal{A}) \mid \mathcal{A} = (S,X,\delta) \in Dir(i)\}$. (2) $\mathcal{L}^X_{CND(i)} = \{D_i(\mathcal{A}) \mid
\mathcal{A} = (S,X,\delta) \in CDir(i)\}$.

Let $\mathbf{D}$ be the class of deterministic directable automata. For $\mathcal{A} \in \mathbf{D}$, $D(\mathcal{A})$
denotes the set of all directing words of $\mathcal{A}$. Then we can define the class
$\mathcal{L}^X_{D} = \{D(\mathcal{A}) \mid \mathcal{A} = (S, X, \delta) \in \mathbf{D}\}$.

Then, by Proposition 1 and Proposition 4, all the above classes are subclasses
of regular languages. Figure 1 depicts the inclusion relations among these 7
classes. In [3], the inclusion relations among more classes are provided.

We will consider the shortest directing words of nondeterministic automata.

Let $i = 1,2,3$ and let $\mathcal{A} = (S, X, \delta)$ be a nondeterministic automaton. Then
$d_i(\mathcal{A})$ denotes the value $\min\{|u| \mid u \in D_i(\mathcal{A})\}$. For any positive integer $n \ge 1$,
$d_i(n)$ denotes the value $\max\{d_i(\mathcal{A}) \mid \mathcal{A} = (S, X, \delta) : \mathcal{A} \in Dir(i)$ and $|S| = n\}$.
Moreover, $cd_i(n)$ denotes the value $\max\{d_i(\mathcal{A}) \mid \mathcal{A} = (S,X,\delta) : \mathcal{A} \in CDir(i)$
and $|S| = n\}$. Notice that in the definitions of $d_i(n)$ and $cd_i(n)$, $X$ ranges over
all finite nonempty alphabets.

[Figure 1: diagram of the inclusion relations among the seven classes of languages.]

Fig. 1. Inclusion relations



In [1], Burkhard determined the value $cd_1(n)$ as follows:

Proposition 5. Let $n \ge 1$. Then $cd_1(n) = 2^n - n - 1$.

For $d_1(n)$, we have the following new result.

Proposition 6. Let $n \ge 2$. Then $2^n - n \le d_1(n) \le \sum_{k=2}^{n} \binom{n}{k}(2^k - 1)$. Notice that
$d_1(1) = 0$ and $d_1(2) = 3$.
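To get a feeling for the gap between the two bounds of Proposition 6, they can be tabulated directly. The short sketch below only evaluates the two expressions; the function name is hypothetical. By the binomial theorem the upper bound also has the closed form $3^n - 2^n - n$, since $\sum_{k=0}^{n}\binom{n}{k}2^k = 3^n$ and $\sum_{k=0}^{n}\binom{n}{k} = 2^n$, and the terms for $k = 0, 1$ contribute $0$ and $n$.

```python
from math import comb

def d1_bounds(n: int):
    """Lower and upper bound for d_1(n) as stated in Proposition 6 (n >= 2)."""
    lower = 2 ** n - n
    upper = sum(comb(n, k) * (2 ** k - 1) for k in range(2, n + 1))
    return lower, upper

table = {n: d1_bounds(n) for n in range(2, 8)}
```

For n = 2 the upper bound evaluates to 3, which is the value used at the end of the proof to conclude $d_1(2) = 3$.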

Proof. Let $n \ge 2$. First, we show that $d_1(n) \le \sum_{k=2}^{n} \binom{n}{k}(2^k - 1)$. Let $\mathcal{A} = (S, X, \delta)$
be a $D_1$-directable automaton with $n$ states and let $w = a_1a_2 \cdots a_r \in D_1(\mathcal{A})$ such
that $a_i \in X$, $i = 1,2,\ldots,r$, $r \ge 1$ and $|w| = r = d_1(\mathcal{A})$. Since $w \in D_1(\mathcal{A})$, there
exists $s_0 \in S$ such that $sw^{\mathcal{A}} = \{s_0\}$ for any $s \in S$. For any $i = 1,2,\ldots,r$, we
define the sets $S_i$ and $T_i$ as follows: (1) $S_i = S(a_1a_2 \cdots a_i)^{\mathcal{A}}$. (2) $T_i = \{t \in S_i \mid
t(a_{i+1}a_{i+2} \cdots a_r)^{\mathcal{A}} = \{s_0\}\}$.

Let $s \in S$ and let $i = 1, 2, \ldots, r$. Since $s(a_1a_2 \cdots a_ia_{i+1} \cdots a_r)^{\mathcal{A}} = (s(a_1a_2 \cdots
a_i)^{\mathcal{A}})(a_{i+1} \cdots a_r)^{\mathcal{A}} = \{s_0\}$, we have $s(a_1a_2 \cdots a_i)^{\mathcal{A}} \cap T_i \ne \emptyset$. Let $S = S_0 = T_0$.
Consider the set $\{(S_i, T_i) \mid i = 0, 1, 2, \ldots, r - 1\}$. It is obvious that $S_i \supseteq T_i$ for
any $i = 0, 1, \ldots, r - 1$. It is also obvious that $|S_0| \ne 1$. Suppose that $|S_i| = 1$ for
some $i = 1,2,\ldots,r - 1$. Then $S_i = T_i = \{t\}$ for some $t \in S$. By the definition
of $T_i$, this means that $a_1a_2 \cdots a_i \in D_1(\mathcal{A})$, which contradicts the minimality
of $|w|$. Therefore, $|S_i| \ne 1$ for any $i = 1,2,\ldots,r - 1$. Hence the set $\{(S_i,T_i) \mid
i = 0, 1, 2, \ldots, r - 1\}$ does not contain any $(\{s\}, \{s\})$ with $s_0 \ne s \in S$.

Now assume that $(S_i, T_i) = (S_j,T_j)$ for some $i,j = 1, 2, \ldots, r - 1$, $i < j$. Then
it can be seen that $a_1a_2 \cdots a_ia_{j+1}a_{j+2} \cdots a_r \in D_1(\mathcal{A})$, which contradicts the
minimality of $|w|$. Hence all $(S_i, T_i)$, $i = 0, 1, 2, \ldots, r - 1$, are distinct. Therefore,
$|\{(S_i, T_i) \mid i = 0, 1, 2, \ldots, r - 1\}| \le \sum_{k=2}^{n} \binom{n}{k}(2^k - 1)$ and hence $r \le \sum_{k=2}^{n} \binom{n}{k}(2^k - 1)$.

We will show that $2^n - n \le d_1(n)$. It is obvious that $d_1(2) \ge 2$. Let $n \ge 3$.
We will construct a $D_1$-directable automaton $\mathcal{A} = (S,X,\delta)$ such that $|S| = n$
and $d_1(\mathcal{A}) = 2^n - n$. Let $S$ be a finite set with $|S| = n$ and let $\{T_1, T_2, \ldots, T_r\} =
\{T \subsetneq S \mid |T| \ge 2\}$. Notice that $r = 2^n - n - 2$. Moreover, we assume that
$|T_1| \ge |T_2| \ge \cdots \ge |T_r|$, $\{s_0\} = S \setminus T_1$ and $T_r = \{s_1,s_2\}$. Now we construct the
following nondeterministic automaton $\mathcal{A} = (S, X, \delta)$: (1) $X = \{a_1, a_2, \ldots, a_r, b\}$.
(2) For any $i = 1, 2, \ldots, r - 1$, $sa_i^{\mathcal{A}} = T_{i+1}$ if $s \in T_i$ and $sa_i^{\mathcal{A}} = S$, otherwise. (3)
$s_1a_r^{\mathcal{A}} = s_2a_r^{\mathcal{A}} = \{s_1\}$ and $sa_r^{\mathcal{A}} = S$ if $s \in S \setminus \{s_1,s_2\}$. (4) $s_0b^{\mathcal{A}} = \emptyset$ and $sb^{\mathcal{A}} = T_1$
for any $s \in S \setminus \{s_0\}$.

Let $s \in S$ and let $i = 1, 2, \ldots, r$. Notice that $s(a_1ba_1a_2 \cdots a_r)^{\mathcal{A}} = \{s_1\}$ and
hence $a_1ba_1a_2 \cdots a_r \in D_1(\mathcal{A})$. Moreover, since $s_0b^{\mathcal{A}} = \emptyset$, we have $bX^* \cap D_1(\mathcal{A}) =
\emptyset$. Let $i,j = 1,2,\ldots,r$. Then $S(a_ia_j)^{\mathcal{A}} = S$. On the other hand, $s(a_ib)^{\mathcal{A}} = T_1$
for any $s \in S$. This means that $u \in a_ibX^*$ if $u$ is a shortest $D_1$-directing word
of $\mathcal{A}$. Let $i = 1,2, \ldots,r - 1$. Then $T_i(a_ia_j)^{\mathcal{A}} = T_{i+1}a_j^{\mathcal{A}} = S$ if $j > i + 1$ and
$T_i(a_ia_j)^{\mathcal{A}} = T_{i+1}a_j^{\mathcal{A}} \supseteq T_{j+1}$ if $j \le i$. Notice that in the latter case $j + 1 \le i + 1$.
This implies that $u$ is not a shortest $D_1$-directing word of $\mathcal{A}$ if $u \in X^*a_ia_jX^*$
where $j \ne i + 1$. Moreover, since $Sb^{\mathcal{A}} = T_1$, $u$ is not a shortest $D_1$-directing word
of $\mathcal{A}$ if $u \in XX^+bX^*$. Consequently, $a_1ba_1a_2 \cdots a_r$ is a shortest $D_1$-directing
word of $\mathcal{A}$, i.e. $d_1(\mathcal{A}) = r + 2 = 2^n - n$. Hence we have $2^n - n \le d_1(n)$.

Finally, we compute $d_1(1)$ and $d_1(2)$. It is obvious that $d_1(1) = 0$. Consider
the following nondeterministic automaton $\mathcal{A} = (\{1, 2\}, \{a, b, c\}, \delta)$: (1) $1a^{\mathcal{A}} =
\{1,2\}$ and $2a^{\mathcal{A}} = \{2\}$. (2) $1b^{\mathcal{A}} = \emptyset$ and $2b^{\mathcal{A}} = \{1,2\}$. (3) $1c^{\mathcal{A}} = \{1\}$ and
$2c^{\mathcal{A}} = \emptyset$.

Then $abc$ is a shortest $D_1$-directing word of $\mathcal{A}$. Since $d_1(2) \le 2^2 - 1 = 3$, we
have $d_1(2) = 3$.
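The claim about the two-state automaton can be checked mechanically. The following sketch performs a breadth-first search over tuples of images, one image per start state, and stops at the first word that is $D_1$-directing. The representation of the automaton as a dictionary and the function name are assumptions made for this illustration.

```python
from collections import deque

def shortest_d1_word(states, alphabet, delta):
    """Shortest word w with s w nonempty for every s and |S w| = 1, or None."""
    states = list(states)
    start = tuple(frozenset([s]) for s in states)      # images of the empty word
    seen = {start: ""}
    queue = deque([start])
    while queue:
        images = queue.popleft()
        if all(images) and len(frozenset().union(*images)) == 1:
            return seen[images]
        for a in alphabet:
            nxt = tuple(
                frozenset().union(*(delta.get((t, a), frozenset()) for t in img))
                for img in images
            )
            if nxt not in seen:
                seen[nxt] = seen[images] + a
                queue.append(nxt)
    return None

# The automaton ({1, 2}, {a, b, c}, delta) from the end of the proof.
delta = {(1, "a"): {1, 2}, (2, "a"): {2},
         (1, "b"): set(),  (2, "b"): {1, 2},
         (1, "c"): {1},    (2, "c"): set()}

w = shortest_d1_word([1, 2], "abc", delta)
```

The search confirms that no word of length at most 2 is $D_1$-directing and that abc is found at length 3.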

Now we consider the value $d_3(n)$. Before dealing with the value $d_3(n)$, we
define a nondeterministic automaton of partial function type.

A nondeterministic automaton $\mathcal{A} = (S, X, \delta)$ is said to be of partial function
type if $|sa^{\mathcal{A}}| \le 1$ for any $s \in S$ and any $a \in X$. Then we have:

Remark 1. Let $\mathcal{A}$ be a nondeterministic automaton of partial function type.
Then $D_3(\mathcal{A}) = D_1(\mathcal{A})$.

Let $\mathcal{A} = (S,X,\delta)$ be a $D_3$-directable automaton of partial function type.
Consider the following procedure $\mathcal{P}$: Let $u \in D_3(\mathcal{A})$. Assume that $u = u_1u_2u_3$
where $u_1,u_3 \in X^*$, $u_2 \in X^+$ and $Su_1^{\mathcal{A}} = S(u_1u_2)^{\mathcal{A}}$. Then procedure $\mathcal{P}$ can be
applied as $u \Rightarrow^{\mathcal{P}} u_1u_3$.

Then we have the following result.

Lemma 1. In the above procedure, we have $u_1u_3 \in D_3(\mathcal{A})$.

Proof. Let $\mathcal{A} = (S, X, \delta)$ be a nondeterministic automaton of partial function
type. Moreover, let $u = u_1u_2u_3$ where $u_1,u_3 \in X^*$, $u_2 \in X^+$ and $Su_1^{\mathcal{A}} =
S(u_1u_2)^{\mathcal{A}}$. Since $u \in D_3(\mathcal{A})$, there exists $s_0 \in S$ such that $su^{\mathcal{A}} = \{s_0\}$ for any
$s \in S$. From the assumptions that $Su_1^{\mathcal{A}} = S(u_1u_2)^{\mathcal{A}}$ and $\mathcal{A}$ is a nondeterministic
automaton of partial function type, it follows that $su^{\mathcal{A}} = s(u_1u_2u_3)^{\mathcal{A}} =
s(u_1u_3)^{\mathcal{A}} = \{s_0\}$ for any $s \in S$. By Remark 1, this means that $u_1u_3 \in D_3(\mathcal{A})$.

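Let $\mathcal{A} = (S, X, \delta)$ be a $D_3$-directable automaton of partial function type and
let $a_1a_2 \cdots a_r \in D_3(\mathcal{A})$ such that $sa_1^{\mathcal{A}} = ta_1^{\mathcal{A}}$ for some $s,t \in S$, $s \ne t$.

Assume that $v \in D_3(\mathcal{A})$, $v = v_1v_2v_3$, $v_1,v_3 \in X^*$, $v_2 \in X^+$, $|Sv_1^{\mathcal{A}}| =
|S(v_1v_2)^{\mathcal{A}}|$ and $\{s,t\} \subseteq Sv_1^{\mathcal{A}}$. Then procedure $\mathcal{Q}(s,t)$ can be applied as
$v \Rightarrow^{\mathcal{Q}(s,t)} v_1a_1a_2 \cdots a_r$.

Then we have the following results.

Lemma 2. In the above procedure, we have $v_1a_1a_2 \cdots a_r \in D_3(\mathcal{A})$ and $|Sv_1^{\mathcal{A}}| >
|S(v_1a_1)^{\mathcal{A}}|$.

Proof. Let $s \in S$. Since $v = v_1v_2v_3 \in D_3(\mathcal{A})$, we have $sv_1^{\mathcal{A}} \ne \emptyset$,
actually $|sv_1^{\mathcal{A}}| = 1$. Notice that $\exists s_r \in S, \forall t \in S$, $t(a_1a_2 \cdots a_r)^{\mathcal{A}} = \{s_r\}$.
Therefore, $s(v_1a_1a_2 \cdots a_r)^{\mathcal{A}} = (sv_1^{\mathcal{A}})(a_1a_2 \cdots a_r)^{\mathcal{A}} = \{s_r\}$ and hence $v_1a_1a_2 \cdots a_r \in
D_3(\mathcal{A})$. Since $\mathcal{A}$ is of partial function type and $\{s,t\} \subseteq Sv_1^{\mathcal{A}}$, $|Sv_1^{\mathcal{A}}| \ge |S(v_1a_1)^{\mathcal{A}}| +
1$. This completes the proof of the lemma.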

Lemma 3. Let $\mathcal{A} = (S,X,\delta)$ be a $D_3$-directable automaton such that $|S| =
n$ and $d_3(\mathcal{A}) = d_3(n)$. Then there exists a nondeterministic automaton $\mathcal{B} =
(S,Y,\gamma)$ of partial function type such that $d_3(\mathcal{B}) = d_3(n)$.

Proof. Let $u = a_1a_2 \cdots a_r \in D_3(\mathcal{A})$ with $|u| = d_3(\mathcal{A})$. Since $u \in D_3(\mathcal{A})$, there
are $s_r \in S$ and a sequence of partial functions of $S$ into $S$, $\rho_1, \rho_2, \ldots, \rho_r$, such that
$s(a_1a_2 \cdots a_i)^{\mathcal{A}} \ni \rho_i(\rho_{i-1}(\cdots (\rho_1(s)) \cdots ))$ for any $s \in S$ and any $i = 1, 2, \ldots, r$.
Furthermore, $\rho_r(\rho_{r-1}(\cdots (\rho_1(s)) \cdots )) = s_r$ for any $s \in S$. Now we define
the automaton of partial function type $\mathcal{B} = (S, Y, \gamma)$ as follows: (1) $Y = \{b_i \mid
i = 1, 2, \ldots, r\}$. Remark that $b_1, b_2, \ldots, b_r$ are distinct symbols. (2) $sb_i^{\mathcal{B}} = \rho_i(s)$
for any $s \in S$ and any $i = 1, 2, \ldots, r$.

Then $\mathcal{B}$ is a nondeterministic automaton of partial function type. Moreover,
it is obvious that $b_1b_2 \cdots b_r \in D_3(\mathcal{B})$. Suppose that $b_{i_1}b_{i_2} \cdots b_{i_k} \in D_3(\mathcal{B})$ where
$i_1,i_2,\ldots,i_k \in \{1, 2, \ldots, r\}$. Then we have $a_{i_1}a_{i_2} \cdots a_{i_k} \in D_3(\mathcal{A})$. Therefore,
$k \ge r$ and $r = d_3(\mathcal{B})$. This completes the proof of the lemma.

We are now ready to determine an upper bound for $d_3(n)$.

Proposition 7. For any $n \ge 3$, $d_3(n) \le \sum_{k=2}^{n-1} \binom{n}{k} - \sum_{k=0}^{n-2} \binom{n-2}{k} + n + 1$.

Proof. By Lemma 3, there exists a nondeterministic automaton of partial
function type $\mathcal{A} = (S,X,\delta)$ such that $|S| = n$ and $d_3(n) = d_3(\mathcal{A})$. Let $u =
a_1a_2 \cdots a_r \in D_3(\mathcal{A})$ with $r = d_3(n)$ and let $S_i = S(a_1a_2 \cdots a_i)^{\mathcal{A}}$ for $i =
1,2,\ldots,r$. Since $\mathcal{A}$ is of partial function type and $r = d_3(n) = d_3(\mathcal{A})$, $|S| \ge
|S_1| \ge |S_2| \ge \cdots \ge |S_{r-1}| > |S_r| = 1$. Let $S_r = \{s_r\}$. By Lemma 1, $S, S_1, S_2, \ldots,
S_{r-1}$ and $S_r$ are distinct. Moreover, since $|S| > |S_1|$, there exist $s_0,s_1 \in S$ such
that $s_0 \ne s_1$ and $s_0a_1^{\mathcal{A}} = s_1a_1^{\mathcal{A}}$. Therefore, we can apply procedure $\mathcal{Q}(s_0,s_1)$ to
$a_1a_2 \cdots a_r$ if necessary and we can get $a_1a_2 \cdots a_r \Rightarrow^{\mathcal{Q}(s_0,s_1)} u_1a_1a_2 \cdots a_r$. Now we
apply procedure IP to Uiaia2 • • • Or as many times as possible until we cannot ap- 
ply procedure IP anymore. Hence we can obtain w G D3(yi) with |i<;| — [S']. 

Then we apply procedure $\mathcal{Q}(s_0,s_1)$ to $w$. We will continue the same process
until we cannot apply either procedure $\mathcal{P}$ or $\mathcal{Q}(s_0,s_1)$. Notice that this process
will be terminated after a finite number of applications of procedures $\mathcal{P}$ and
$\mathcal{Q}(s_0,s_1)$. Let $w = c_1c_2 \cdots c_s$, $c_i \in X$, $i = 1,2, \ldots,s$, be the last $D_3$-directing
word of $\mathcal{A}$ which was obtained by the above process. Let $T_i = S(c_1c_2 \cdots c_i)^{\mathcal{A}}$
for any $i = 1,2,\ldots,s$. Then $T_i \ne T_j$ for any $i,j = 1,2, \ldots,s$ with $i < j$
and $\{T_1, T_2, \ldots, T_s\}$ contains at most $n - 2$ elements $T_i$, $i = 1,2, \ldots,s$, with
$T_i \supseteq \{s_0,s_1\}$. Since $|\{T \subseteq S \mid \{s_0,s_1\} \subseteq T\}| = \sum_{k=0}^{n-2} \binom{n-2}{k}$ and by the above
observation (including Lemma 2), we have $d_3(n) \le \sum_{k=2}^{n-1} \binom{n}{k} - \sum_{k=0}^{n-2} \binom{n-2}{k} + n + 1$.



For the lower bound for $d_3(n)$, we have the following new result.

Proposition 8. Let $n \ge 3$. Then $d_3(n) \ge 2^m + 1$ if $n = 2m$ ($d_3(n) \ge 3 \cdot 2^{m-1} + 1$
if $n = 2m + 1$).

Proof. Let $n \ge 3$ and let $S = \{1,2,\ldots,n\}$. Moreover, let $S_1 = \{1,2\}$, let
$S_2 = \{3, 4\}$, $\ldots$, let $S_{m-1} = \{2m - 3, 2m - 2\}$ and let $S_m = \{2m - 1, 2m\}$ if
$n = 2m$ ($S_m = \{2m - 1, 2m, 2m + 1\}$ if $n = 2m + 1$).

We define the following $D_3$-directable automaton $\mathcal{A} = (S,X,\delta)$:

(1) $\{T_1,T_2,\ldots,T_k\} = \{\{n_1,n_2,\ldots,n_m\} \mid (n_1,n_2,\ldots,n_m) \in S_1 \times S_2 \times \cdots \times S_m\}$
where $k = 2^m$ if $n = 2m$ ($k = 3 \cdot 2^{m-1}$ if $n = 2m + 1$). (2) $T_1 = \{1, 3, 5, \ldots, 2m - 1\}$.
(3) $X = \{a, b_1,b_2,\ldots,b_{k-2}, b_{k-1}, c\}$. (4) $1a^{\mathcal{A}} = 2a^{\mathcal{A}} = \{1\}$, $3a^{\mathcal{A}} = 4a^{\mathcal{A}} = \{3\}$,
$\ldots$, $(2m - 3)a^{\mathcal{A}} = (2m - 2)a^{\mathcal{A}} = \{2m - 3\}$ and $(2m - 1)a^{\mathcal{A}} = (2m)a^{\mathcal{A}} = \{2m - 1\}$ if
$n = 2m$ ($(2m - 1)a^{\mathcal{A}} = (2m)a^{\mathcal{A}} = (2m + 1)a^{\mathcal{A}} = \{2m - 1\}$ if $n = 2m + 1$).
(5) Let $i = 1, 2, \ldots, k - 1$. By $\rho_i$, we denote a bijection of $T_i$ onto
$T_{i+1}$. Then $tb_i^{\mathcal{A}} = \rho_i(t)$ for any $t \in T_i$ and $tb_i^{\mathcal{A}} = \emptyset$, otherwise. (6) $tc^{\mathcal{A}} = \{1\}$
for any $t \in T_k$ and $tc^{\mathcal{A}} = \emptyset$, otherwise.

Then it can be easily verified that $ab_1b_2 \cdots b_{k-1}c$ is a unique shortest
$D_3$-directing word of $\mathcal{A}$. Therefore, $d_3(n) \ge 2^m + 1$ if $n = 2m$ ($d_3(n) \ge 3 \cdot 2^{m-1} + 1$
if $n = 2m + 1$).
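The construction in the proof can be tried out for small n. The sketch below is only an illustration under several assumptions: the sets $T_1, \ldots, T_k$ are enumerated in the order produced by `itertools.product`, each bijection $\rho_i$ pairs up the elements of $T_i$ and $T_{i+1}$ that lie in the same block $S_j$ (the text allows any bijection), the function names are hypothetical, and the check is restricted to $n \ge 4$, where every $T_i$ has at least two elements.

```python
from collections import deque
from itertools import product

def build(n):
    """Partial-function automaton following items (1)-(6) of the proof (n >= 4)."""
    m = n // 2
    blocks = [[2 * i + 1, 2 * i + 2] for i in range(m)]      # S_1, ..., S_m
    if n % 2 == 1:
        blocks[-1].append(n)                                  # S_m gets 2m + 1
    transversals = list(product(*blocks))                     # T_1 = (1, 3, ..., 2m-1)
    k = len(transversals)
    delta = {}
    for block in blocks:
        for s in block:
            delta[(s, "a")] = block[0]
    for i in range(k - 1):                                    # letter b_{i+1}
        for t, target in zip(transversals[i], transversals[i + 1]):
            delta[(t, f"b{i + 1}")] = target
    for t in transversals[-1]:
        delta[(t, "c")] = 1
    alphabet = ["a"] + [f"b{i + 1}" for i in range(k - 1)] + ["c"]
    return list(range(1, n + 1)), alphabet, delta

def shortest_d3_length(states, alphabet, delta):
    """Length of a shortest D3-directing word of a partial-function automaton."""
    start = tuple(states)                    # image of each state, one slot per state
    seen = {start: 0}
    queue = deque([start])
    while queue:
        images = queue.popleft()
        if len(set(images)) == 1:
            return seen[images]
        for a in alphabet:
            nxt = tuple(delta.get((t, a)) for t in images)
            if None in nxt:                  # some image became empty: dead end
                continue
            if nxt not in seen:
                seen[nxt] = seen[images] + 1
                queue.append(nxt)
    return None

lengths = {n: shortest_d3_length(*build(n)) for n in range(4, 8)}
```

For n = 4, 5, 6, 7 the search returns k + 1, that is, $2^m + 1$ for even n and $3 \cdot 2^{m-1} + 1$ for odd n.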

Now we consider the values $cd_2(n)$ and $d_2(n)$. The lower bound is due to [1]
and the upper bound follows from [5].

Proposition 9. For $n \ge 2$, $2^n - n - 1 \le cd_2(n) \le d_2(n) \le 1 + (2^n - 2)\binom{n}{2}$.
Remark that $cd_2(1) = d_2(1) = 0$.




Masami Ito and Kayoko Shikishima-Tsuji



Finally, we provide a result on the value of $cd_3(n)$. The result is due to [2]
and [5].

Proposition 10. Let $n \ge 1$. Then $(n - 1)^2 \le cd_3(n) \le 1 + (n - 2)\binom{n}{2}$.

3 Commutative Nondeterministic Directable Automata 

In this section, we will deal with commutative nondeterministic automata and
related languages along the same lines as in the previous section.

A nondeterministic automaton $\mathcal{A} = (S,X,\delta)$ is said to be commutative if
$s(ab)^{\mathcal{A}} = s(ba)^{\mathcal{A}}$ holds for any $s \in S$ and any $a,b \in X$.

For $i = 1,2,3$, we consider the classes of regular languages of directing words
of deterministic commutative automata, of
$D_i$-directing words of complete commutative nondeterministic automata, and of
$D_i$-directing words of commutative nondeterministic automata.

Then we have the following inclusion relations among these classes (see
Figure 2).



[Figure 2: diagram of the inclusion relations among the classes of the commutative case.]

Fig. 2. Commutative case



Now we will consider the shortest directing words of commutative
nondeterministic automata. The results in this section are due to [4].

Let $i = 1,2,3$ and let $n \ge 1$. Then $cd_{com(i)}(n)$ denotes the value $\max\{d_i(\mathcal{A})
\mid \mathcal{A} = (S, X, \delta)$ : commutative, $\mathcal{A} \in CDir(i)$ and $|S| = n\}$.

Notice that in the definitions of $d_{com(i)}(n)$ and $cd_{com(i)}(n)$, $X$ ranges over all
finite nonempty alphabets.

Proposition 11. For any $n \ge 1$, $d_{com(1)}(n) = cd_{com(1)}(n) = n - 1$.

Proposition 12. Let $n \ge 2$. Then $(n - 1)^2 + 1 \le cd_{com(2)}(n) = d_{com(2)}(n) \le
2^n - 2$. For $n = 1$, $cd_{com(2)}(1) = d_{com(2)}(1) = 0$.

Proposition 13. Let $n \ge 2$. Then $n^2 - 3n + 3 \le cd_{com(3)}(n) = d_{com(3)}(n) \le
1 + (n - 2)\binom{n}{2}$. For $n = 1$, $cd_{com(3)}(1) = d_{com(3)}(1) = 0$.

For more detailed information on deterministic and nondeterministic
directable automata, refer to [6].



Abstract. We consider sets of rectangles and squares recognized by
deterministic and non-deterministic two-dimensional finite-state automata.
We show that sets of squares recognized by DFAs from the inside can
be as sparse as any recursively enumerable set. We also show that NFAs
can only recognize sets of rectangles from the outside that correspond to
simple regular languages.



1 Introduction 

Two-dimensional languages, or picture languages, are an interesting generaliza- 
tion of the standard languages of computer science. Rather than one-dimensional 
strings, we consider two-dimensional arrays of symbols over a finite alphabet. 
These arrays can then be accepted or rejected by various types of automata, 
and this gives rise to different language classes. Such classes may be of interest 
as formal models of image recognition, or simply as mathematical objects in 
their own right. 

Much of this work has focused on two- or more-dimensional generalizations 
of regular languages. We can define the regular languages as those recognized by 
finite-state automata that can move in one direction or both directions on the 
input, and which are deterministic or non-deterministic. We can also consider 
finite complement languages, in which some finite set of subwords is forbidden, 
and then project onto a smaller alphabet with some alphabetic homomorphism. 
In one dimension, these are all equivalent in their computational power. 

In two dimensions, we can consider 4-way finite-state automata, which at
each step can read a symbol of the array, change their internal state, and move
up, down, left or right to a neighboring symbol. These can be deterministic or
non-deterministic, and DFAs and NFAs of this kind were introduced by Blum
and Hewitt [1]. Similarly, we can forbid a finite number of subblocks and then
project onto a smaller alphabet, obtaining a class of picture languages which
we call homomorphisms of local lattice languages, or h(LLL)s [9]. (These are
also called the recognizable languages [3] or the languages recognizable by
non-deterministic on-line tessellation acceptors [5].) While DFAs, NFAs and h(LLL)s
are equivalent in one dimension, in two or more they become distinct:

DFA $\subset$ NFA $\subset$ h(LLL)

where these inclusions are strict. Reviews of these classes are given in [9,4,7,12],
and a bibliography of papers in the subject is maintained by Borchert at [2].

A fair amount is known about the closure properties of these classes as well. 
The DFA, NFA, and h(LLL) languages are all closed under intersection and 
union using straightforward constructions. 

The situation for complement is somewhat more complicated. DFAs are 
closed under complement by an argument of Sipser [13] which allows us to re- 
move the danger that a DFA might loop forever and never halt. We construct a 
new DFA that starts in the final halt state, which we can assume without loss 
of generality is in the lower right-hand corner. Then this DFA does a depth-first
search backwards, attempting to reach the initial state of the original DFA. (We 
use a similar construction in Section 2 below.) This gives a loop-free DFA which 
accepts if and only if the original DFA accepts, and since a loop-free DFA al- 
ways halts, we can then switch accepting and rejecting states to recognize the 
complement of the original language. 

It is also known that h(LLL)s are not closed under complement, and in [8]
we proved that NFAs are not closed under complement either. In the process, we
also proved in [8] that NFAs are more powerful than DFAs, even for rectangles of
a single symbol. We note that recognition of rectangles by NFAs, in the context of
“recognizable functions,” is studied in [6], where some upper bounds are shown.

In this paper we consider recognizing picture languages over a single-letter
alphabet, that is, recognizing squares and rectangles. We start by investigat-
ing sets of squares recognized by DFAs, and we show that DFAs can recognize 
squares whose sizes belong to sets which are as sparse as any recursively enu- 
merable set. This resolves an open question raised in [9] about how sparse these 
sets can be. 

We then consider the question of what sets of rectangles can be recognized by 
an NFA or DFA from the outside, i.e. by an automaton which is not allowed to 
enter the rectangle but can roam the plane outside it. We show that any such set 
corresponds to a simple regular language, and therefore that such an automaton 
can be simulated by a DFA moving along the rectangle’s boundary. 

2 Square Recognition 

Our model of a four-way finite automaton is sensitive to the borders of the input
rectangle. In other words, the automaton knows which of the neighboring squares
are inside and which are outside of the rectangle. Based on this information and
the current state of the automaton, the transition rule specifies the new state
of the automaton and the direction Left, Right, Up or Down of movement. The
automaton is not allowed to move in a direction that takes it to
the other side of the border.

In this section we consider the recognition problem from the inside of the 
rectangle. In this case the automaton is initially located inside the rectangle, at 
the lower left corner. The rectangle is accepted if and only if the automaton is 
able to eventually reach its accepting state. A DFA has a deterministic transition 
rule, while in an NFA there may be several choices of the next move. The automaton
recognizes the set $S \subseteq \mathbb{N}^2$ where

$S = \{(w, h) \mid \text{the automaton accepts the rectangle of size } w \times h\}$.

A DFA can easily check if a given rectangle is a square by moving diagonally
through the rectangle. Since the classes of sets recognized by DFAs and by NFAs are
closed under intersection, the set of squares recognized by a DFA or NFA is
recognizable by an automaton of the same type. A natural question to ask is
what sets of squares can be recognized by an NFA or a DFA. We say that a set
$S \subseteq \mathbb{N}$ is square-recognizable by an NFA (DFA) if the set of squares
$\{(n, n) \mid n \in S\}$ is recognizable by an NFA (DFA).

Example 1. [9] The set $\{2^n \mid n \in \mathbb{N}\}$ is square-recognized by a DFA that uses
knight’s moves to divide the distance to a corner by two, as in Figure 1.




Fig. 1. Using knight’s moves to recognize $2^n \times 2^n$ squares
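The halving step of Example 1 can be imitated in a few lines of Python. This is only a sketch under assumed conventions: the automaton is taken to sit at distance d from a corner on the bottom edge and to walk toward that corner in steps of two cells left and one cell up. The function names are hypothetical and are not taken from [9].

```python
def halve_by_knight_moves(d):
    """Walk from (d, 0) toward x = 0 in steps of (-2, +1).

    Returns (height reached, leftover x), which equals (d // 2, d % 2).
    """
    x, y = d, 0
    while x >= 2:
        x -= 2
        y += 1
    return y, x


def square_recognizes_power_of_two(n):
    """Accept side length n iff repeated halving reaches 1 with no remainder."""
    d = n
    while d > 1:
        d, leftover = halve_by_knight_moves(d)
        if leftover:  # d was odd and larger than 1, so n is not a power of 2
            return False
    return d == 1
```

Only the parity of the leftover and a comparison with the corner are inspected, which is the kind of information a finite control can track.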



Example 2. A slightly more complex construction shows that the set
$\{2^{2^n} \mid n \in \mathbb{N}\}$ can be square-recognized. In this construction a signal
of rational speed 3/4 is repeatedly used to move the automaton from position $4^n$
into position $3^n$, in the same style as the knight’s moves were used in the previous
example. That is, we multiply the automaton’s position on an edge by 3/4 until the
result is no longer an integer. Then signals of speed 2/3 are iterated to move the
automaton from position $3^n$ into position $2^n$. This process accomplishes the
following changes in the automaton’s position:

$2^{2^n} = 4^{2^{n-1}} \to 3^{2^{n-1}} \to 2^{2^{n-1}}$

The process is iterated until position $2 = 2^{2^0}$ is reached, in which case the square
is accepted. If it ends at position 1 instead, we reject. Note that we can check
the position mod 2, 3, or 4 at any time by using a finite number of states. At
the very beginning we can verify that the square size is a power of 2, using the
DFA from our first example.
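The arithmetic of Example 2 can be checked with a short Python sketch. It tracks only the automaton's position on an edge, not its actual walk. The parity test after the 3/4 phase is our assumption about how the "position mod 2, 3, or 4" checks are used: it rejects when the position is not a pure power of 3. The function name is hypothetical.

```python
def accepts_double_power(size):
    """Return True iff size = 2^(2^n) for some n >= 0, following Example 2."""
    # Initial check: the size must be a power of two (the DFA of Example 1).
    if size < 1 or size & (size - 1):
        return False
    pos = size
    while pos > 2:
        # Signals of speed 3/4: multiply by 3/4 while the result is an integer.
        while pos % 4 == 0:
            pos = pos // 4 * 3
        # Assumed check: starting from 4^m we must now be at the odd number 3^m.
        if pos % 2 == 0:
            return False
        # Signals of speed 2/3: multiply by 2/3 while the result is an integer.
        while pos % 3 == 0:
            pos = pos // 3 * 2
    return pos == 2  # position 1 (size 1) is rejected
```

For instance, size 16 passes through the positions 16, 12, 9, 6, 4, 3, 2 and is accepted, while size 8 stops at 6 after the first phase and is rejected.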



We see from these examples that rather “sparse” sets of squares can be 
recognized, even by deterministic finite automata. In what follows we prove that 
these sets can be made as sparse as any recursive (or recursively enumerable) 
set. This result is optimal because any square-recognized set has to be recursive. 

Theorem 1. Let $\{a_1, a_2, \ldots\}$ be any recursively enumerable set of positive
integers. There exists a 2D DFA that square-recognizes a set $\{b_1, b_2, \ldots\}$ such that
$b_i > a_i$ for all $i = 1, 2, \ldots$

Proof. We exploit classical results by Minsky [11] and a very natural correspon- 
dence between automata inside rectangles and 2-counter machines without an 
input tape. A 2-counter machine without an input tape consists of a determinis- 
tic finite state automaton and two infinite registers, called counters. The counters 
store one non-negative integer each. The machine can detect when either counter 
is zero. It changes its state according to a deterministic transition rule. The new 
state only depends on the old state and the zero vs. non-zero status of the two 
counters. The transition rule also specifies how to change the counters: They may 
be incremented, decremented (only if they are non-zero) or kept unchanged. The 
automaton accepts iff it reaches a specified accepting state. We define the set 
recognized by such a device as the set of integers k such that the machine enters 
the accepting state when started with counter values 0 and k. 
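To make the definition concrete, here is a minimal Python sketch of a 2-counter machine without an input tape. The encoding of the transition rule as a dictionary, the step bound `max_steps`, and the sample machine `EVEN` are our own illustrative choices, not part of Minsky's construction.

```python
def run_two_counter(delta, start, accept, k, max_steps=10_000):
    """Run the machine from counter values (0, k).

    delta maps (state, counter1_is_zero, counter2_is_zero) to
    (new_state, change1, change2) with changes in {-1, 0, +1}.
    Returns (accepted, largest counter value seen).
    """
    state, c1, c2 = start, 0, k
    peak = max(c1, c2)
    for _ in range(max_steps):
        if state == accept:
            return True, peak
        key = (state, c1 == 0, c2 == 0)
        if key not in delta:  # no applicable rule: halt without accepting
            return False, peak
        state, d1, d2 = delta[key]
        c1, c2 = c1 + d1, c2 + d2
        if c1 < 0 or c2 < 0:
            raise ValueError("a counter at zero must not be decremented")
        peak = max(peak, c1, c2)
    return False, peak  # step bound reached (illustration only)


# Hypothetical sample machine: accepts exactly the even values of k.
EVEN = {
    ("even", True, False): ("odd", 0, -1),
    ("odd", True, False): ("even", 0, -1),
    ("even", True, True): ("acc", 0, 0),
}
```

The second component of the result, the largest counter value, is the quantity that determines how large a square is needed to simulate the computation.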

It follows from Minsky’s construction in [11] that for every recursively
enumerable set $X$ of positive integers there exists a 2-counter machine $A$ that accepts
the set

$Y = \{2^n \mid n \in X\}$.

Moreover, we may assume that the 2-counter machine $A$ never loops, i.e., never
returns back to the same state and counter values it had before. This assumption
means no loss of generality because we can start Minsky’s construction from
a Turing machine that does not loop. Such a Turing machine exists for every
recursively enumerable set. (It may, for example, count on the tape the number
of instructions it executes.)

Notice that a 2-counter machine can be interpreted as a 2D DFA that operates
on the (infinite) positive quadrant $\{(x, y) \mid x, y \ge 0\}$ of the plane and
has the same finite states as the 2-counter machine. The values of the counters
correspond to the coordinates of the position of the automaton. Increments
and decrements of the counters correspond to movements of the DFA on the
plane. Counter values zero are the positive axes $(x, 0)$ and $(0, y)$, which can be
detected by the DFA.

An accepting computation of a 2-counter machine can therefore be simulated 
by a 2D DFA inside a sufficiently large square. The square has to be at least as 
large as the largest counter value used during the accepting computation. Using 
this observation we prove that if $Y$ is any set of positive integers accepted by a
2-counter machine $A$ that does not loop then some 2D DFA square-recognizes
the set

$Z = \{k \in \mathbb{N} \mid k \text{ is the largest counter value during the accepting computation by } A \text{ of some } i \in Y\}$.

It is very easy to construct a non-deterministic 2D automaton that square- 
recognizes set Z: The NFA moves along the lower edge of the square and uses 
the non-determinism to guess the starting position of the 2-counter machine A. 
Then it executes A until A accepts, halts otherwise or tries to move outside the 
square. Moreover, the NFA memorizes in the finite control unit if it ever touched 
the right or upper edge of the square, indicating that a counter value equal to 
the size of the square was used. The square is accepted if and only if A accepts 
after touching the right or the upper edge of the square. Note that the forward 
simulation part of this process is deterministic. Non-determinism is only needed 
to find the correct input to the 2-counter machine. 
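The nondeterministic procedure can be imitated in Python by trying every guess in turn. The sketch below assumes that a square of size n allows counter values 0, ..., n, and it uses a hypothetical sample machine `DOUBLER` that moves counter 2 into counter 1 while doubling it, so that input k has largest counter value 2k. The `seen` set only protects the simulation against looping machines, which the proof excludes.

```python
def run_in_square(delta, start, accept, k, n):
    """Simulate from counters (0, k) inside [0, n] x [0, n].

    Returns True iff the machine accepts after some counter reached n.
    """
    state, c1, c2 = start, 0, k
    touched = (k == n)
    seen = set()
    while state != accept:
        if (state, c1, c2) in seen:
            return False
        seen.add((state, c1, c2))
        key = (state, c1 == 0, c2 == 0)
        if key not in delta:
            return False  # halts without accepting
        state, d1, d2 = delta[key]
        c1, c2 = c1 + d1, c2 + d2
        if not (0 <= c1 <= n and 0 <= c2 <= n):
            return False  # tried to leave the square
        touched = touched or n in (c1, c2)
    return touched


def square_in_Z(delta, start, accept, n):
    """Replace the nondeterministic guess by a search over all inputs k <= n."""
    return any(run_in_square(delta, start, accept, k, n) for k in range(n + 1))


# Hypothetical sample machine: input k has largest counter value 2k.
DOUBLER = {
    ("a", True, False): ("b", 1, -1),
    ("a", False, False): ("b", 1, -1),
    ("b", False, True): ("a", 1, 0),
    ("b", False, False): ("a", 1, 0),
    ("a", True, True): ("acc", 0, 0),
    ("a", False, True): ("acc", 0, 0),
}
```

For this machine the accepted square sizes are exactly the even numbers, since the accepting computation on k >= 1 peaks at 2k.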

To accept set Z deterministically we use the determinism of the 2-counter 
machine to depth-first-search the predecessor tree of a given configuration C to 
see if it is on a computation path for some valid input. This is done by trying 
all possible predecessors of a configuration in a predetermined order, and recur- 
sively processing all these predecessors. When no predecessors exist the DFA 
can backtrack using the determinism of the 2-counter machine, and continue 
with the next branch of the predecessor tree. Note that the predecessor tree 
- restricted inside a square - is finite and does not contain any loops so the 
depth-first-search process will eventually terminate. It finds a valid starting con- 
figuration that leads to the given configuration C if it exists. If no valid input 
configuration exists the process may identify an incorrect starting configuration 
if the depth-first-search backtracked beyond C. In any case, if a potential start 
configuration is found, a forward simulation of the 2-counter machine is done to 
see if it accepts after touching the right or the upper edge of the square. 
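The backward search can be sketched in Python as follows. Unlike the 2D DFA, which backtracks using the determinism of the machine and needs no extra memory, this illustration keeps an explicit stack and a visited set. The transition encoding and the sample machine `DOUBLER` (which doubles its input) are hypothetical.

```python
def step(delta, config, n):
    """One deterministic step inside [0, n] x [0, n]; None if impossible."""
    state, c1, c2 = config
    key = (state, c1 == 0, c2 == 0)
    if key not in delta:
        return None
    new_state, d1, d2 = delta[key]
    c1, c2 = c1 + d1, c2 + d2
    if not (0 <= c1 <= n and 0 <= c2 <= n):
        return None
    return (new_state, c1, c2)


def predecessors(delta, config, n):
    """All configurations inside the square whose successor is config."""
    _, c1, c2 = config
    result = []
    for prev_state in sorted({key[0] for key in delta}):
        for d1 in (-1, 0, 1):
            for d2 in (-1, 0, 1):
                cand = (prev_state, c1 - d1, c2 - d2)
                inside = 0 <= cand[1] <= n and 0 <= cand[2] <= n
                if inside and step(delta, cand, n) == config:
                    result.append(cand)
    return result


def find_starts(delta, start_state, config, n):
    """Depth-first search of the predecessor tree of config."""
    found, stack, visited = [], [config], set()
    while stack:
        c = stack.pop()
        if c in visited:
            continue
        visited.add(c)
        if c[0] == start_state and c[1] == 0:
            found.append(c)  # a candidate input (0, k)
        stack.extend(predecessors(delta, c, n))
    return found


DOUBLER = {
    ("a", True, False): ("b", 1, -1),
    ("a", False, False): ("b", 1, -1),
    ("b", False, True): ("a", 1, 0),
    ("b", False, False): ("a", 1, 0),
    ("a", True, True): ("acc", 0, 0),
    ("a", False, True): ("acc", 0, 0),
}
```

Starting from the configuration with state "a" and counters (4, 0) in a square of size 4, the search walks back to the input configuration with counters (0, 2).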

The depth-first-search process described above is done on all configurations
C that are close to the upper left or the lower right corners of the input square.
If the 2-counter machine A has s states then any accepting computation with
maximum counter value k must enter some cell within distance s of one of the
corners: otherwise counter value k is present in a loop that takes the automaton
away from the axes, and the computation cannot be accepting.

There are a finite number of configurations C within distance s from the 
corners, so the 2D DFA can easily try all of them one after the other. 

Now we are ready to conclude the proof. If $X = \{a_1, a_2, \ldots\}$ is any r.e. set
of positive integers then there exists a 2-counter machine $A$ that does not loop
and accepts $Y = \{2^{a_1}, 2^{a_2}, \ldots\}$. The construction above provides a 2D DFA that
square-recognizes the set $Z = \{b_1, b_2, \ldots\}$ where $b_i$ is the maximum counter value
during the accepting computation of $2^{a_i}$ by $A$. Clearly $b_i \ge 2^{a_i} > a_i$.

Another way of stating the previous theorem is 

Corollary 1. Let $f : \mathbb{N} \to \mathbb{N}$ be any computable function. Then some 2D DFA
square-recognizes the range of some function $g : \mathbb{N} \to \mathbb{N}$ that satisfies $g(n) > f(n)$
for all $n \in \mathbb{N}$.

It is interesting to note that there seem to be no tools whatsoever, at the 
present time, to prove that a set S cannot be square-recognized by a DFA. To 
inspire the creation of such tools, we make the following conjecture: 

Conjecture 1. No DFA or NFA in two dimensions can recognize the set of squares 
of prime size. 

In contrast, the reader can easily show that there is a two-dimensional DFA
that recognizes squares of size $11^p$ for $p$ prime. This works by first moving $p$ from
the 11-register to the 2-register, that is, by converting $11^p$ to $2^p$ as in Example 2
above. It then copies $p$ into the 3-register, and uses the 5- and 7-registers to
maintain a number $m$ by which it divides $p$, with a loop that passes $m$ back and
forth between the 5- and 7-registers while decrementing the 3-register. After each
division, it increments $m$ and recopies $p$ from the 2-register to the 3-register. We
halt if $m$ exceeds $\lfloor p/4 \rfloor$, which we can check with a similar loop; this suffices for
all $p > 9$. Then since $2^p 3^p 7^{p/4} < 11^p$, all this can take place inside a square of
size $11^p$.
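Two numerical claims in this paragraph are easy to check by machine. The Python sketch below assumes that the halting bound reads floor(p/4) and that the size bound reads 2^p 3^p 7^(p/4) < 11^p. It verifies that trial division up to floor(p/4) decides primality for p > 9, and that the size bound holds. The function names are hypothetical.

```python
from math import isqrt


def is_prime_quarter_bound(p):
    """Trial division by m = 2, 3, ..., floor(p / 4); intended for p > 9."""
    m = 2
    while m <= p // 4:
        if p % m == 0:
            return False
        m += 1
    return True


def is_prime_reference(p):
    """Standard trial division up to the square root, for comparison."""
    return p >= 2 and all(p % d for d in range(2, isqrt(p) + 1))


def fits_in_square(p):
    """Check 2^p * 3^p * 7^(p/4) < 11^p by comparing fourth powers."""
    return (2 ** p * 3 ** p) ** 4 * 7 ** p < (11 ** p) ** 4
```

The restriction p > 9 matters: for p = 9 the only divisor tried is 2, so 9 would wrongly pass the test.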

In addition, in [9] DFAs are given that recognize cubes of prime size, and 
tesseracts of perfect size, in three and four dimensions respectively. These use a 
billiard-ball-like computation to check whether two integers are mutually prime; 
thus they simulate counter machines which branch both when the counter is zero 
and when it reaches its maximum, i.e. the size of the cube. 

We also conjecture the following: 

Conjecture 2. NFAs are more powerful than DFAs on squares of one symbol. 

Since we have proved in [8] that NFAs are more powerful than DFAs on rectangles 
of one symbol, it would be rather strange if they were equivalent on squares. 

3 Recognizing Rectangles from Outside 

Let us consider next the “dual” problem of recognizing rectangles using a finite 
automaton that operates outside the rectangle and is not allowed to penetrate 
inside. Initially the automaton is located outside one of the corners of the rectan- 
gle, and the rectangle is accepted if the automaton is able to reach its accepting 
state. Not surprisingly, such devices turn out to be less powerful than automata 
that operate inside rectangles. We prove that if $S \subseteq \mathbb{N}^2$ is the set of rectangles
recognized by an NFA from outside, then the language

$L_S = \{0^i 1^j \mid (i, j) \in S\}$

is regular. In other words, the same set of rectangles can be recognized by a DFA
that moves east and south along the edges of the rectangle from the upper left
corner to the lower right corner. A similar result was shown for DFAs which are
allowed to make excursions into the plane outside the rectangle by Milgram [10].

Lemma 1. Let $S \subseteq \mathbb{N}^2$ be recognized from the outside by an NFA $A$. Then there
exist positive integers $t_H$ and $p_H$, called the horizontal transient and the period,
respectively, such that for every $w > t_H$ and every $h$

$(w, h) \in S \implies (w + p_H, h) \in S$

where the numbers $t_H$ and $p_H$ are independent of the height $h$.

Proof. We generalize an argument from [9] for rectangles of height 1. Consider
first computation paths $P$ by $A$ on the unmarked plane that start in state $q$ at
position $(0, 0)$ and end in state $r$ on the same row $y = 0$. Assume that $P$ is
entirely above that row, i.e. all intermediate positions $(x, y)$ of $P$ have $y > 0$.
Let us prove the following fact: there must exist positive integers $t$ and $p$ such
that, if $P$ ends in position $(x, 0)$ with $x > t$ then there is a computation path
$P'$ from the same initial position $(0, 0)$ and state $q$, into position $(x + p, 0)$ and
state $r$, such that $P'$ is also entirely above the row $y = 0$. In other words, the
possible horizontal positions where $A$ returns to the initial row $y = 0$ in state $r$
are eventually periodic.

To prove this, notice that sequences $I$ of instructions executed by $A$ during
possible $P$’s form a context-free language, accepted by a pushdown automaton
that uses its stack to count its vertical position and its finite state control to
simulate the states of $A$. Recall that a Parikh vector of a word is the integer
vector whose elements are the counts of different letters in the word. It is well
known that in the case of a context-free language the Parikh vectors of the
words in the language form a semi-linear set, that is, a finite union of linear sets
$\{u + \sum_{i=1}^{k} a_i v_i \mid a_i = 0, 1, \ldots\}$ for integer vectors $u$ and $v_i$, $i = 1, \ldots, k$.

Any linear transformation keeps linear sets linear; therefore, the possible
values of $n_r(I) - n_l(I)$ form a semi-linear set of integers, where $n_r(I)$ and $n_l(I)$
indicate the numbers of right- and left-moving instructions in the word $I$,
respectively. Because $n_r(I) - n_l(I)$ is the total horizontal motion caused by executing
$I$, we conclude that possible horizontal positions at the end of $P$ form a
semi-linear set $\bigcup_j \{t_j + i p_j \mid i = 0, 1, \ldots\}$. By choosing $t$ to be the maximum of all the
$t_j$’s and $p$ as the lowest common multiple of the positive $p_j$’s, we obtain numbers
that satisfy the following claim: if $x > t$ is the horizontal position at the end of
some $P$, then $x = t_j + i p_j$ for some numbers $i \ge 1$ and $j$ with $p_j > 0$. Because
$p_j$ divides $p$ we see that $x + p$ is in the same linear set $\{t_j + i p_j \mid i = 0, 1, \ldots\}$ as
$x$ is.
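The choice of $t$ and $p$ at the end of this argument can be illustrated numerically. The Python sketch below represents a semi-linear set of integers as a list of pairs (t_j, p_j), each standing for the linear set {t_j + i p_j | i = 0, 1, ...}. The representation and the function names are our own, chosen for illustration.

```python
from functools import reduce
from math import gcd


def transient_and_period(linear_sets):
    """t = largest offset t_j, p = lcm of the positive periods p_j."""
    t = max(tj for tj, _ in linear_sets)
    positive = [pj for _, pj in linear_sets if pj > 0]
    p = reduce(lambda a, b: a * b // gcd(a, b), positive, 1)
    return t, p


def member(x, linear_sets):
    """Is x = t_j + i * p_j for some j and some integer i >= 0?"""
    for tj, pj in linear_sets:
        if pj == 0:
            if x == tj:
                return True
        elif (x - tj) % pj == 0 and (x - tj) // pj >= 0:
            return True
    return False
```

For the sets given by (3, 4), (10, 6), (7, 0) and (5, -2) this yields t = 10 and p = 12, and every member x > 10 has x + 12 as a member as well. Below the transient the shift can fail: 1 is a member but 13 is not.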

As proved above, the transients t and the periods p exist for all start states 
q and end states r. We can find a common transient and period for all start and 
end states by taking the lowest common multiple of the periods p and the largest 
of the transients t. 
Fig. 2. Four parts of the plane



Now we are ready to proceed with the proof of the Lemma. Let us divide the 
plane outside a rectangle into four parts as indicated in Figure 2. Parts L and R 
are semi-infinite strips to the left and to the right of the rectangle. Consider an 
accepting computation C for a $w \times h$ rectangle and let us divide C into segments
at the transitions between the four parts. More precisely, a segment will be a part 
of C that is entirely inside one of the four regions, and the automaton touches 
the border between two regions only at the beginning and end of each segment. 

If the width w of the rectangle is increased, all segments of C can be executed 
unchanged except segments that take the automaton from L to R or from R to
L through areas U or D. 

Let us study a segment that takes the automaton from L to R through U. 
Let us divide it into subsegments that start and end on the same row as the 
beginning and end of the entire segment, so that inside the subsegments all 
positions are strictly above that row. See the illustration in Figure 3. Notice 
that the subsegments start and end in positions that touch the upper edge of 
the rectangle. The only exceptions are the start and end positions of the first 
and last subsegment, which lie above L and R respectively. 




Fig. 3. A sample segment under consideration 
If any of the subsegments moves the automaton more than $t$ cells to the right,
where $t$ is the number from the first part of the proof, then according to the first
part that subsegment can be replaced by another one that moves the automaton
$k \cdot p$ additional cells to the right, for any $k = 1, 2, \ldots$. This allows us to ‘pump’
this segment of the computation farther to the right.

If, on the other hand, every subsegment moves $A$ either to the left or at
most $t$ cells to the right, we can still pump the segment farther right as follows.
Let $s$ be the number of states in $A$. If the width $w$ of the rectangle is at least
$(s + 1)t$ then the automaton must touch the upper edge of the rectangle at an
increasing series of times at positions $x_1 < x_2 < x_3 < \cdots < x_{s+1}$, all satisfying
$x_{i+1} - x_i \le t$. It follows from the pigeonhole principle that the automaton must
be in the same state when it touches the rectangle in two different positions
$x_i$ and $x_j$ for some $i < j$. The computation path between positions $x_i$ and $x_j$
can be repeated arbitrarily many times, leading to an additional movement by
$k \cdot (x_j - x_i)$ cells to the right, for any chosen $k = 1, 2, \ldots$. Because $x_j - x_i \le st$
we conclude that an additional movement to the right by any multiple of $(st)!$
cells is possible.

From the two cases above it follows that if the width of the rectangle is at
least $(s + 1)t$ then path $P$ can be modified to cross an additional distance of the
lowest common multiple of $(st)!$ and $p$. Therefore, the width of the rectangle can
be increased by $\mathrm{lcm}(p, (st)!)$.

A similar argument can be made for segments that cross from R to L, or
between R and L through the lower half D. By taking the maximum of the
transient lengths and the lowest common multiple of the periods we obtain numbers
$t_H$ and $p_H$ that satisfy the lemma. The numbers are independent of the height
of the rectangle.

The previous lemma has, naturally, a vertical counterpart: 

Lemma 2. Let $S \subseteq \mathbb{N}^2$ be recognized from outside by an NFA $A$. Then there
exist positive integers $t_V$ and $p_V$ such that for every $h > t_V$ and every $w$

$(w, h) \in S \implies (w, h + p_V) \in S$

where the numbers $t_V$ and $p_V$ are independent of the width $w$.

Finally, we use the following technical result: 

Lemma 3. Assume that a language $L \subseteq 0^*1^*$ satisfies the following monotonicity
conditions: there exist positive integers $n$ and $m$ such that

$0^i 1^j \in L,\ i \ge n \implies (\forall x \ge i)\ 0^x 1^j \in L$, and
$0^i 1^j \in L,\ j \ge m \implies (\forall y \ge j)\ 0^i 1^y \in L$.

Then $L$ is regular.

Proof. For every fixed $k$ the language

$A_k = L \cap 0^k 1^*$

is regular: if for some $j \ge m$ we have $0^k 1^j \in L$, then $A_k$ is the union of $0^k 1^j 1^*$
and a finite number of words that are shorter than $0^k 1^j$. If no such $j$ exists then
$A_k$ is finite and therefore regular. Analogously, languages

$B_k = L \cap 0^* 1^k$

are regular.

If $L$ contains some word $0^i 1^j$ with $i \ge n$ and $j \ge m$ then

$L = A_0 \cup A_1 \cup \cdots \cup A_{i-1} \cup B_0 \cup B_1 \cup \cdots \cup B_{j-1} \cup 0^i 0^* 1^j 1^*$

is regular since it is a finite union of regular languages. If no such word $0^i 1^j$ is
in $L$ then

$L = A_0 \cup A_1 \cup \cdots \cup A_{n-1} \cup B_0 \cup B_1 \cup \cdots \cup B_{m-1}$

is a union of a finite number of sets of the form $A_k$ and $B_k$, and is therefore
regular.



Now we have all the necessary tools to prove the main result of the section: 

Theorem 2. Suppose an NFA recognizes from outside a set $S \subseteq \mathbb{N}^2$ of rectangles.
Then the language

$L_S = \{0^i 1^j \mid (i, j) \in S\}$

is regular.



Proof. Let $t_H$ and $p_H$ be the horizontal transient and the period from Lemma 1,
and let $t_V$ and $p_V$ be their vertical counterparts from Lemma 2. For every
$a = 0, 1, \ldots, p_H - 1$ and $b = 0, 1, \ldots, p_V - 1$ let us define

$L_{a,b} = L_S \cap \{0^i 1^j \mid i \equiv a \pmod{p_H},\ j \equiv b \pmod{p_V}\}$.

Because $L_S$ is the union of languages $L_{a,b}$ it is enough to prove that every $L_{a,b}$
is regular.

But it follows from Lemmas 1 and 2 that the language

$L'_{a,b} = \{0^{(i-a)/p_H} 1^{(j-b)/p_V} \mid 0^i 1^j \in L_{a,b}\}$

satisfies the monotonicity condition of Lemma 3 and is therefore regular. If $h$ is
the homomorphism that replaces 0 with $p_H$ 0’s and 1 with $p_V$ 1’s then

$L_{a,b} = 0^a h(L'_{a,b}) 1^b$.

Thus $L_{a,b}$ is regular.
Abstract. The word substitutions are binary word operations which
can be basically interpreted as a deletion followed by insertion, with some
restrictions applied. Besides being an interesting topic in formal
language theory in their own right, they have been naturally applied to modelling noisy
channels. We introduce the concept of substitution on trajectories which 
generalizes a class of substitution operations. Within this framework, we 
study their closure properties and decision questions related to language 
equations. We also discuss applications of substitution on trajectories in 
modelling complex channels and a cryptanalysis problem. 

1 Introduction 

There are two basic forms of the word substitution operation. The substitution
in $\alpha$ by $\beta$ means to substitute certain letters of the word $\alpha$ by the letters of $\beta$.
The substitution in $\alpha$ of $\beta$ means to substitute the letters of $\beta$ within $\alpha$ by other
letters, provided that $\beta$ is scattered within $\alpha$. In both cases the overall length of
$\alpha$ is not changed. Also, we assume that a letter must not be substituted by the
same letter.

These two operations are closely related and, indeed, we prove in Section 4 
that they are mutual left inverses. Their motivation comes from coding theory 
where they have been used to model certain noisy channels [8] . The natural idea 
is to assume that during a transfer through a noisy channel, some letters of the 
transferred word can be distorted, that is, replaced by different letters. This can be
modelled by a substitution operation extended to sets of words. This approach 
also allows one to take into account that certain substitutions are more likely 
than others. Hence the algebraic, closure and other properties of the substitution 
operation are of interest, to study how a set of messages (=language) can change 
when transferred through a noisy channel. 

In this paper we generalize the idea of substitution using the syntactical 
constraints — trajectories. The shuffle on trajectories as a generalization of se- 
quential insertion has been studied since 1996 [16, 17]. Recently also its inverse 
— the deletion on trajectories has been introduced [1, 10]. A trajectory acts as a 
syntactical condition, restricting the positions of letters within the word where
an operation places its effect. Hence the shuffle and deletion on trajectories can 
be understood as meta-operations, defining a whole class of insertion/deletion 
operations due to the set of trajectories at hand. This idea turned out to be 
fruitful, with several interesting consequences and applications [1-4, 11, 14, 15]. 

We give a basic description of these operations in Section 3. Then in Section 
4 we introduce on a similar basis the substitution and difference on trajectories. 
From the point of view of noisy channels, the application of trajectories allows 
one to restrict positions of errors within words, their frequency etc. We then 
study the closure properties of substitution on trajectories in Section 5 and basic 
decision questions connected with them in Section 6. In Section 7 we discuss a 
few applications of the substitution on trajectories in modelling complex noisy 
channels and a cryptanalysis problem. In the former case, the channels involved 
permit only substitution errors. This restriction allows us to improve the time 
complexity of the problem of whether a given regular language is error-detecting 
with respect to a given channel [13]. 

2 Definitions 

An alphabet is a finite and nonempty set of symbols. In the sequel we shall use
a fixed alphabet $\Sigma$. $\Sigma$ is assumed to be non-singleton, if not stated otherwise.
The set of all words (over $\Sigma$) is denoted by $\Sigma^*$. This set includes the empty
word $\lambda$. The length of a word $w$ is denoted by $|w|$. $|w|_x$ denotes the number of
occurrences of $x$ within $w$, for $w, x \in \Sigma^*$.

For a nonnegative integer $n$ and a word $w$, we use $w^n$ to denote the word that
consists of $n$ concatenated copies of $w$. The Hamming distance $H(u, v)$ between
two words $u$ and $v$ of the same length is the number of corresponding positions
in which $u$ and $v$ differ. For example, $H(abba, aaaa) = 2$.
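As a quick illustration, the Hamming distance can be computed as follows in Python. The function name is ours, and raising an error for words of different lengths is a design choice of this sketch, since the distance is only defined for words of the same length.

```python
def hamming(u, v):
    """Number of positions in which the equal-length words u and v differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs words of the same length")
    return sum(a != b for a, b in zip(u, v))
```

With this definition, `hamming("abba", "aaaa")` evaluates to 2, in agreement with the example above.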

A language $L$ is a set of words, or equivalently a subset of $\Sigma^*$. A language
is said to be $\lambda$-free if it does not contain the empty word. For a language $L$, we
write $L_\lambda$ to denote $L \cup \{\lambda\}$. If $n$ is a nonnegative integer, we write $L^n$ for the
language consisting of all words of the form $w_1 \cdots w_n$ such that each $w_i$ is in $L$.
We also write $L^*$ for the language $L^0 \cup L^1 \cup L^2 \cup \cdots$ and $L^+$ for the language
$L^* - \{\lambda\}$. The notation $L^c$ represents the complement of the language $L$, that
is, $L^c = \Sigma^* - L$. For the classes of regular, context-free, and context sensitive
languages, we use the notations REG, CF and CS, respectively.

A nondeterministic finite automaton with $\lambda$ productions (or transitions), a
$\lambda$-NFA for short, is a quintuple $A = (S, \Sigma, s_0, F, P)$ such that $S$ is the finite and
nonempty set of states, $s_0$ is the start state, $F$ is the set of final states, and $P$ is
the set of productions of the form $sx \to t$, where $s$ and $t$ are states in $S$, and $x$ is
either a symbol in $\Sigma$ or the empty word. If there is no production with $x = \lambda$, the
automaton is called an NFA. If for every two productions of the form $sx_1 \to t_1$
and $sx_2 \to t_2$ of an NFA we have that $x_1 \ne x_2$ then the automaton is called a
DFA (deterministic finite automaton). The language accepted by the automaton
$A$ is denoted by $L(A)$. The size $|A|$ of the automaton $A$ is the number $|S| + |P|$.

A finite transducer (in standard form) is a sextuple $T = (S, \Sigma, \Sigma', s_0, F, P)$
such that $\Sigma'$ is the output alphabet, the components $S$, $s_0$, $F$ are as in the case
of $\lambda$-NFAs, and the set $P$ consists of productions of the form $sx \to yt$ where $s$
and $t$ are states in $S$, $x \in \Sigma \cup \{\lambda\}$ and $y \in \Sigma' \cup \{\lambda\}$. If $x$ is nonempty for every
production then the transducer is called a gsm (generalized sequential machine).
If, in addition, $y$ is nonempty for every production then the transducer is called
a $\lambda$-free gsm. The relation realized by the transducer $T$ is denoted by $R(T)$. The
size $|T|$ of the transducer $T$ (in standard form) is $|S| + |P|$. We refer the reader
to [18] for further details on automata and formal languages.

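A binary word operation is a mapping $\diamond : \Sigma^* \times \Sigma^* \to 2^{\Sigma^*}$, where $2^{\Sigma^*}$ is the set
of all subsets of $\Sigma^*$. The characteristic relation of $\diamond$ is

$C_\diamond = \{(w, u, v) : w \in u \diamond v\}$.

For any languages $X$ and $Y$, we define

$X \diamond Y = \bigcup_{u \in X,\, v \in Y} u \diamond v$. (1)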

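It should be noted that every subset $B$ of $\Sigma^* \times \Sigma^* \times \Sigma^*$ defines a unique binary
word operation whose characteristic relation is exactly $B$. For an operation $\diamond$
we define its left inverse $\diamond^l$ as

$w \in (x \diamond v)$ iff $x \in (w \diamond^l v)$, for all $v, x, w \in \Sigma^*$,

and the right inverse $\diamond^r$ of $\diamond$ as

$w \in (u \diamond y)$ iff $y \in (u \diamond^r w)$, for all $u, y, w \in \Sigma^*$.

Moreover, the word operation $\diamond'$ defined by $u \diamond' v = v \diamond u$ is called reversed $\diamond$.
It should be clear that, for every binary operation $\diamond$, the triple $(w, u, v)$ is in
$C_\diamond$ if and only if $(u, w, v)$ is in $C_{\diamond^l}$ if and only if $(v, u, w)$ is in $C_{\diamond^r}$ if and only if
$(w, v, u)$ is in $C_{\diamond'}$. If $x$ and $y$ are symbols in $\{l, r, '\}$, the notation $\diamond^{xy}$ represents
the operation $(\diamond^x)^y$. Using the above observations, one can establish identities
between operations of the form $\diamond^{xy}$.

Lemma 1. (i) $\diamond^{ll} = \diamond^{rr} = \diamond'' = \diamond$,

(ii) $\diamond^{l'} = \diamond^{'r} = \diamond^{rl}$,

(iii) $\diamond^{r'} = \diamond^{'l} = \diamond^{lr}$.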

Below we list several binary word operations together with their left and right
inverses [6,7].

Catenation: $u \cdot v = \{uv\}$, with $\cdot^l = \to_{rq}$ and $\cdot^r = \to_{lq}$.

Left quotient: $u \to_{lq} v = \{w\}$ if $u = vw$, with $\to_{lq}^l = \cdot'$ and $\to_{lq}^r = \to_{rq}$.

Right quotient: $u \to_{rq} v = \{w\}$ if $u = wv$, with $\to_{rq}^l = \cdot$ and $\to_{rq}^r = \to_{lq}$.

Shuffle (or scattered insertion): $u \sqcup\!\sqcup v = \{u_1 v_1 \cdots u_k v_k u_{k+1} \mid k \ge 1,\ u = u_1 \cdots u_k u_{k+1},\ v = v_1 \cdots v_k\}$,
with $\sqcup\!\sqcup^l = \leadsto$ and $\sqcup\!\sqcup^r = \leadsto'$.

Scattered deletion: $u \leadsto v = \{u_1 \cdots u_k u_{k+1} \mid k \ge 1,\ u = u_1 v_1 \cdots u_k v_k u_{k+1},\ v = v_1 \cdots v_k\}$,
with $\leadsto^l = \sqcup\!\sqcup$ and $\leadsto^r = \leadsto$.

We shall also write $uv$ for $u \cdot v$.
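For experimenting with these operations on short words, a brute-force Python sketch may help. It enumerates positions directly and is exponential in general, so it is meant only for small examples. All function names are ours.

```python
from itertools import combinations


def left_quotient(u, v):
    """{w} if u = vw, otherwise the empty set."""
    return {u[len(v):]} if u.startswith(v) else set()


def right_quotient(u, v):
    """{w} if u = wv, otherwise the empty set."""
    return {u[:len(u) - len(v)]} if u.endswith(v) else set()


def shuffle(u, v):
    """All interleavings of u and v that keep the letter order of each."""
    n = len(u) + len(v)
    result = set()
    for positions in combinations(range(n), len(v)):
        chosen = set(positions)  # positions taken by the letters of v
        iu, iv = iter(u), iter(v)
        result.add("".join(next(iv) if i in chosen else next(iu)
                           for i in range(n)))
    return result


def scattered_deletion(u, v):
    """All words obtained by deleting v from u as a scattered subword."""
    result = set()
    for positions in combinations(range(len(u)), len(v)):
        if all(u[p] == c for p, c in zip(positions, v)):
            chosen = set(positions)
            result.add("".join(c for i, c in enumerate(u) if i not in chosen))
    return result
```

The inverse relationship between shuffle and scattered deletion can then be observed directly: every word w in `shuffle(u, v)` satisfies `u in scattered_deletion(w, v)`.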
3 Shuffle and Deletion on Trajectories

The above insertion and deletion operations can be naturally generalized using 
the concept of trajectories. A trajectory defines an order in which the operation 
is applied to the letters of its arguments. Notice that this restriction is purely 
syntactical, as the content of the arguments has no influence on this order. 
Formally, a trajectory is a string over the trajectory alphabet V = {0,1}. The 
following definitions are due to [1, 16, 10]. 

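Let $\Sigma$ be an alphabet and let $t$ be a trajectory, $t \in V^*$. Let $\alpha, \beta$ be two words
over $\Sigma$.

Definition 1. The shuffle of $\alpha$ with $\beta$ on the trajectory $t$, denoted by $\alpha \sqcup\!\sqcup_t \beta$,
is defined as follows:

$\alpha \sqcup\!\sqcup_t \beta = \{\alpha_1 \beta_1 \ldots \alpha_k \beta_k \mid \alpha = \alpha_1 \ldots \alpha_k,\ \beta = \beta_1 \ldots \beta_k,\ t = 0^{i_1} 1^{j_1} \ldots 0^{i_k} 1^{j_k}$, where
$|\alpha_m| = i_m$ and $|\beta_m| = j_m$ for all $m$, $1 \le m \le k\}$.

Definition 2. The deletion of $\beta$ from $\alpha$ on trajectory $t$ is the following binary
word operation:

$\alpha \leadsto_t \beta = \{\alpha_1 \ldots \alpha_k \mid \alpha = \alpha_1 \beta_1 \ldots \alpha_k \beta_k,\ \beta = \beta_1 \ldots \beta_k,\ t = 0^{i_1} 1^{j_1} \ldots 0^{i_k} 1^{j_k}$, where
$|\alpha_m| = i_m$ and $|\beta_m| = j_m$ for all $m$, $1 \le m \le k\}$.

Observe that due to the above definition, if $|\alpha| \ne |t|$ or $|\beta| \ne |t|_1$, then
$\alpha \leadsto_t \beta = \emptyset$.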
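Read operationally, a trajectory is a script: for the shuffle, a 0 takes the next letter of $\alpha$ and a 1 takes the next letter of $\beta$; for the deletion, a 0 keeps the current letter of $\alpha$ and a 1 deletes it, provided it matches the next letter of $\beta$. The following Python sketch follows this reading. The function names are ours, and trajectories are assumed to be strings over the characters "0" and "1".

```python
def shuffle_on_trajectory(alpha, beta, t):
    """Shuffle of alpha with beta on trajectory t (a set with at most one word)."""
    if t.count("0") != len(alpha) or t.count("1") != len(beta):
        return set()
    ia, ib = iter(alpha), iter(beta)
    return {"".join(next(ia) if bit == "0" else next(ib) for bit in t)}


def deletion_on_trajectory(alpha, beta, t):
    """Deletion of beta from alpha on trajectory t (a set with at most one word)."""
    if len(t) != len(alpha) or t.count("1") != len(beta):
        return set()
    ib = iter(beta)
    kept = []
    for letter, bit in zip(alpha, t):
        if bit == "0":
            kept.append(letter)       # letter of alpha that survives
        elif letter != next(ib):      # letter to delete must come from beta
            return set()
    return {"".join(kept)}
```

For example, shuffling "abc" with "xy" on the trajectory 01001 gives "axbcy", and deleting "xy" from "axbcy" on the same trajectory gives back "abc".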

A set of trajectories is any set $T \subseteq V^*$. We extend the shuffle and deletion
to sets of trajectories as follows:

$\alpha \sqcup\!\sqcup_T \beta = \bigcup_{t \in T} \alpha \sqcup\!\sqcup_t \beta, \qquad \alpha \leadsto_T \beta = \bigcup_{t \in T} \alpha \leadsto_t \beta$. (2)

The operations $\sqcup\!\sqcup_T$ and $\leadsto_T$ generalize to languages due to (1).

Example 1. The following binary word operations can be expressed via shuffle
on trajectories using certain sets of trajectories.

1. Let $T = 0^*1^*$, then $\sqcup\!\sqcup_T = \cdot$, the catenation operation, and $\leadsto_T = \to_{rq}$,
the right quotient.

2. For $T = 1^*0^*$ we have $\sqcup\!\sqcup_T = \cdot'$, the anti-catenation, and $\leadsto_T = \to_{lq}$, the
left quotient.

3. Let $T = \{0, 1\}^*$, then $\sqcup\!\sqcup_T = \sqcup\!\sqcup$, the shuffle, and $\leadsto_T = \leadsto$, the scattered
deletion.

We refer to [1, 16, 10] for further elementary results concerning shuffle and
deletion on trajectories.
4 Substitution on Trajectories

Based on the previously studied concepts of the insertion and deletion on tra- 
jectories, we consider a generalization of three natural binary word operations 
which are used to model certain noisy channels [8]. Generally, a channel [13] is a
binary relation $\gamma \subseteq \Sigma^* \times \Sigma^*$ such that $(u, u)$ is in $\gamma$ for every word $u$ in the input
domain of $\gamma$; this domain is the set $\{u \mid (u, v) \in \gamma \text{ for some word } v\}$. The fact
that $(u, v)$ is in $\gamma$ means that the word $v$ can be received from $u$ via the channel
$\gamma$. In [8], certain channels with insertion, deletion and substitution errors are
characterized via word operations. For instance, the channel with exactly $m$
insertion errors is the set of all pairs $(u, v)$ such that $v \in u \sqcup\!\sqcup \Sigma^m$, and analogously
for deletion errors. The following definitions allow one to characterize channels
with substitution errors.

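Definition 3. If $u, v \in \Sigma^*$ then we define the substitution in $u$ by $v$ as

$u \bowtie v = \{u_1 v_1 u_2 v_2 \ldots u_k v_k u_{k+1} \mid k \ge 0,\ u = u_1 a_1 u_2 a_2 \ldots u_k a_k u_{k+1},\ v = v_1 \ldots v_k,$
$a_i, v_i \in \Sigma,\ 1 \le i \le k,\ a_i \ne v_i,\ \forall i,\ 1 \le i \le k\}$.

The case $k = 0$ corresponds to $v = \lambda$ when no substitution is performed.

Definition 4. If $u, v \in \Sigma^*$ then we define the substitution in $u$ of $v$ as

$u \triangle v = \{u_1 a_1 u_2 a_2 \ldots u_k a_k u_{k+1} \mid k \ge 0,\ u = u_1 v_1 u_2 v_2 \ldots u_k v_k u_{k+1},\ v = v_1 \ldots v_k,$
$a_i, v_i \in \Sigma,\ 1 \le i \le k,\ a_i \ne v_i,\ \forall i,\ 1 \le i \le k\}$.

Definition 5. Let $u, v \in \Sigma^*$, $|u| = |v|$, let $H(u, v)$ be the Hamming distance of
$u$ and $v$. We define

$u \triangleright v = \{v_1 v_2 \ldots v_k \mid k = H(u, v),\ u = u_1 a_1 \ldots u_k a_k u_{k+1},\ v = u_1 v_1 \ldots u_k v_k u_{k+1},$
$a_i, v_i \in \Sigma,\ 1 \le i \le k,\ a_i \ne v_i,\ \forall i,\ 1 \le i \le k\}$.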

The above definitions are due to [8], where it is also shown that the left- and
the right-inverse of $\bowtie$ are $\triangle$ and $\triangleright$, respectively. Given two binary word
operations $\diamond_1, \diamond_2$, their composition $(\diamond_1 \diamond_2)$ is defined as

$$w \in u (\diamond_1 \diamond_2) v \iff w \in (u \diamond_1 v_1) \diamond_2 v_2,\quad v = v_1 v_2,$$

for all $u, v, w \in \Sigma^*$. Then it is shown, among other results, that:

(i) The channel with at most $m$ substitution and insertion errors is equal to

$$\{(u, v) \mid v \in u (\triangle\, \sqcup\!\sqcup)(\Sigma^0 \cup \cdots \cup \Sigma^m)\}.$$

(ii) The channel with at most $m$ substitution and deletion errors is equal to

$$\{(u, v) \mid v \in u (\triangle \leadsto)(\Sigma^0 \cup \cdots \cup \Sigma^m)\}.$$

Moreover, further consequences including composition of channels, inversion of
channels etc. are derived. The above substitution operations can be generalized
using trajectories as follows.

Definition 6. For a trajectory $t \in V^*$ and $u, v \in \Sigma^*$ we define the substitution
in $u$ by $v$ on trajectory $t$ as

$$u \bowtie_t v = \{u_1 v_1 u_2 v_2 \cdots u_k v_k u_{k+1} \mid k \ge 0,\ u = u_1 a_1 \cdots u_k a_k u_{k+1},\ v = v_1 \cdots v_k,\ t = 0^{j_1} 1 0^{j_2} 1 \cdots 0^{j_k} 1 0^{j_{k+1}},\ a_i, v_i \in \Sigma,\ a_i \ne v_i,\ \forall i,\ 1 \le i \le k,\ j_i = |u_i|,\ 1 \le i \le k+1\}.$$

Definition 7. For a trajectory $t \in V^*$ and $u, v \in \Sigma^*$ we define the substitution
in $u$ of $v$ on trajectory $t$ as

$$u \triangle_t v = \{u_1 a_1 u_2 a_2 \cdots u_k a_k u_{k+1} \mid k \ge 0,\ u = u_1 v_1 \cdots u_k v_k u_{k+1},\ v = v_1 \cdots v_k,\ t = 0^{j_1} 1 0^{j_2} 1 \cdots 0^{j_k} 1 0^{j_{k+1}},\ a_i, v_i \in \Sigma,\ a_i \ne v_i,\ \forall i,\ 1 \le i \le k,\ j_i = |u_i|,\ 1 \le i \le k+1\}.$$

Definition 8. For a trajectory $t \in V^*$ and $u, v \in \Sigma^*$ we define the right
difference of $u$ and $v$ on trajectory $t$ as

$$u \triangleright_t v = \{v_1 v_2 \cdots v_k \mid k \ge 0,\ u = u_1 a_1 \cdots u_k a_k u_{k+1},\ v = u_1 v_1 \cdots u_k v_k u_{k+1},\ t = 0^{j_1} 1 0^{j_2} 1 \cdots 0^{j_k} 1 0^{j_{k+1}},\ a_i, v_i \in \Sigma,\ a_i \ne v_i,\ \forall i,\ 1 \le i \le k,\ j_i = |u_i|,\ 1 \le i \le k+1\}.$$


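These operations can be generalized to sets of trajectories in the natural way:

$$u \bowtie_T v = \bigcup_{t \in T} u \bowtie_t v,\qquad u \triangle_T v = \bigcup_{t \in T} u \triangle_t v \qquad\text{and}\qquad u \triangleright_T v = \bigcup_{t \in T} u \triangleright_t v.$$

Example 2. Let $T = V^*$, i.e. the set $T$ contains all the possible trajectories.
Then $\bowtie_T = \bowtie$, $\triangle_T = \triangle$ and $\triangleright_T = \triangleright$.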
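
The three operations on a single trajectory can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: words and trajectories are encoded as Python strings (a trajectory being a string over "0" and "1" of the same length as $u$), and the function names `subst_by`, `subst_of` and `right_diff` are hypothetical names chosen here, not notation from the paper.

```python
from itertools import product

def subst_by(u, v, t):
    """Substitution in u by v on trajectory t: the letters of u at the
    1-positions of t are replaced, in order, by the letters of v."""
    if len(t) != len(u) or t.count("1") != len(v):
        return set()
    out, letters = [], iter(v)
    for a, bit in zip(u, t):
        if bit == "0":
            out.append(a)
        else:
            b = next(letters)
            if b == a:            # the definition requires a_i != v_i
                return set()
            out.append(b)
    return {"".join(out)}

def subst_of(u, v, t, sigma):
    """Substitution in u of v on trajectory t: the letters of u at the
    1-positions of t must spell v; each is replaced by any other letter."""
    if len(t) != len(u) or t.count("1") != len(v):
        return set()
    ones = [i for i, bit in enumerate(t) if bit == "1"]
    if "".join(u[i] for i in ones) != v:
        return set()
    choices = [[a for a in sigma if a != u[i]] for i in ones]
    results = set()
    for repl in product(*choices):
        w = list(u)
        for i, a in zip(ones, repl):
            w[i] = a
        results.add("".join(w))
    return results

def right_diff(u, v, t):
    """Right difference of u and v on trajectory t: u and v agree at the
    0-positions and differ at the 1-positions; the result collects the
    letters of v at the 1-positions."""
    if not (len(u) == len(v) == len(t)):
        return set()
    out = []
    for a, b, bit in zip(u, v, t):
        if (bit == "0") != (a == b):
            return set()
        if bit == "1":
            out.append(b)
    return {"".join(out)}

if __name__ == "__main__":
    print(subst_by("abab", "ba", "1100"))        # {'baab'}
    print(subst_of("baab", "ba", "1100", "ab"))  # {'abab'}
    print(right_diff("abab", "baab", "1100"))    # {'ba'}
```

The three printed results are related as one would expect from the inverse relations between the operations: the word obtained by substituting in `abab` can be mapped back by `subst_of`, and `right_diff` recovers the inserted letters.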

One can observe that similarly as in [8], the above defined substitution on trajec- 
tories could be used to characterize channels where errors occur in certain parts 
of words only, or with a certain frequency and so on. Due to the fact that the 
trajectory is a syntactic restriction, only such channels can be modelled where 
the occurrence of errors may depend on the length of the transferred message, 
but not on its content. In the sequel we study various properties of the above 
defined substitution operations. 

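Lemma 2. For a set of trajectories $T$ and words $u, v \in \Sigma^*$, the following
holds:

(i) $\bowtie_T^l = \triangle_T$ and $\bowtie_T^r = \triangleright_T$,

(ii) $\triangle_T^l = \bowtie_T$ and $\triangle_T^r = \triangleright_T'$,

(iii) $\triangleright_T^l = \triangle_T'$ and $\triangleright_T^r = \bowtie_T$.

Proof. (i) Consider the characteristic relation $C_{\bowtie_t}$ of the operation $\bowtie_t$. Observe
that $(w, u, v) \in C_{\bowtie_t}$ iff $(u, w, v) \in C_{\triangle_t}$ iff $(v, u, w) \in C_{\triangleright_t}$. Then the statements
$\bowtie_t^l = \triangle_t$ and $\bowtie_t^r = \triangleright_t$, $t \in T$, follow directly by a careful reading of the
definitions of $\bowtie_t$, $\triangle_t$ and $\triangleright_t$. Now observe that

$$u \bowtie_T^l v = \bigcup_{t \in T} u \bowtie_t^l v = \bigcup_{t \in T} u \triangle_t v = u \triangle_T v.$$

The proof for $\bowtie_T^r$ is analogous.

(ii) Due to Lemma 1, $\bowtie_T^l = \triangle_T$ implies $\triangle_T^l = \bowtie_T$, and $\bowtie_T^r = \triangleright_T$ implies

$$\triangle_T^r = (\bowtie_T^r)' = \triangleright_T'.$$

(iii) Similarly, $\bowtie_T^r = \triangleright_T$ implies $\triangleright_T^r = \bowtie_T$, and consequently

$$\triangleright_T^l = (\bowtie_T^l)' = \triangle_T'. \qquad\Box$$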

5 Closure Properties 

Before addressing the closure properties of substitution, we show first that any 
(not necessarily recursively enumerable) language over a two letter alphabet can 
be obtained as a result of substitution. 

Lemma 3. For an arbitrary language $L \subseteq \{a, b\}^*$ there exists a set of
trajectories $T$ such that

(i) $L = a^* \bowtie_T b^*$,

(ii) $L = a^* \triangle_T a^*$.

Proof. Let $T = \phi(L)$, $\phi : \{a, b\}^* \to V^*$ being a coding morphism such that
$\phi(a) = 0$, $\phi(b) = 1$. The statements follow easily by definition. □
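
Statement (i) of Lemma 3 can be checked on a small finite language with the following Python sketch. It assumes the reading of substitution on a trajectory in which the letters of the first word at the 1-positions of the trajectory are replaced; the helper names `phi` and `a_star_subst_b_star` are hypothetical and chosen only for this illustration.

```python
def phi(word):
    """The coding a -> 0, b -> 1 used in the proof of Lemma 3."""
    return word.replace("a", "0").replace("b", "1")

def a_star_subst_b_star(trajectories):
    """Substitution of b* into a* along each trajectory t: starting from
    the word a^{|t|}, every position where t has a 1 receives the letter b,
    which differs from a as the definition requires."""
    return {"".join("b" if bit == "1" else "a" for bit in t)
            for t in trajectories}

if __name__ == "__main__":
    L = {"", "ab", "bba", "aaa"}
    T = {phi(w) for w in L}
    print(a_star_subst_b_star(T) == L)   # True
```

Since the coding is a bijection between words over {a, b} and trajectories, the same construction works for any language, including non-recursively-enumerable ones; the sketch can of course only enumerate finite samples.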

Similarly to the case of shuffle and deletion on trajectories [1, 16, 10],
substitution on trajectories can be characterized by simpler language operations.

Lemma 4. Let $\diamond_T$ be one of the operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$. Then there exist a
finite substitution $h_1$, morphisms $h_2$, $g$ and a regular language $R$ such that for
all languages $L_1, L_2 \subseteq \Sigma^*$, and for all sets of trajectories $T \subseteq V^*$,

$$L_1 \diamond_T L_2 = g\big((h_1(L_1) \sqcup\!\sqcup h_2(L_2) \sqcup\!\sqcup T) \cap R\big). \qquad (3)$$

Proof. Let $\Sigma_i = \{a_i \mid a \in \Sigma\}$, for $i = 1, 2, 3$, be copies of $\Sigma$ such that $\Sigma$, $\Sigma_1$, $\Sigma_2$,
$\Sigma_3$ and $V$ are pairwise disjoint alphabets. For a letter $a \in \Sigma$, we denote by $a_i$
the corresponding letter from $\Sigma_i$, $i = 1, 2, 3$.

Let further $h_1 : \Sigma \to (\Sigma_1 \cup \Sigma_3)^*$ be a finite substitution and let $h_2 : \Sigma \to \Sigma_2^*$
and $g : (\Sigma_1 \cup \Sigma_2 \cup \Sigma_3 \cup V) \to \Sigma^*$ be morphisms.

(i) If $\diamond_T = \bowtie_T$, then define $h_1(a) = \{a_1, a_3\}$, $h_2(a) = a_2$ for each $a \in \Sigma$. Let

$$R = (\Sigma_1 \cdot \{0\} \cup \{a_3 b_2 1 \mid a, b \in \Sigma,\ a \ne b\})^*.$$

Let further $g(a_1) = a$, $g(a_2) = a$, for all $a_1 \in \Sigma_1$, $a_2 \in \Sigma_2$, and $g(x) = \lambda$ for
all $x \in \Sigma_3 \cup V$. Then one can easily verify that (3) holds true.

(ii) If $\diamond_T = \triangle_T$, then let $h_1(a) = \{a_1\} \cup \{a_3\} \cdot \Sigma_1$, $h_2(a) = a_2$ for each $a \in \Sigma$.
Let further

$$R = (\Sigma_1 \cdot \{0\} \cup \{a_3 a_2 b_1 1 \mid a, b \in \Sigma,\ a \ne b\})^*,$$

and $g(a_1) = a$ for all $a_1 \in \Sigma_1$, $g(x) = \lambda$ for all $x \in \Sigma_2 \cup \Sigma_3 \cup V$.

(iii) If $\diamond_T = \triangleright_T$, then define $h_1(a) = a_1$, $h_2(a) = \{a_2, a_3\}$ for each $a \in \Sigma$
(so that here $h_2$, rather than $h_1$, is the proper finite substitution). Let

$$R = (\{a_1 a_2 0 \mid a \in \Sigma\} \cup \{a_1 b_3 1 \mid a, b \in \Sigma,\ a \ne b\})^*,$$

and $g(a_3) = a$ for all $a_3 \in \Sigma_3$, $g(x) = \lambda$ for all $x \in \Sigma_1 \cup \Sigma_2 \cup V$.

□

The previous lemmata now allow us to make statements about closure properties
of the substitution operations.

Theorem 1. For a set of trajectories $T \subseteq V^*$, the following three statements
are equivalent.

(i) $T$ is a regular language.

(ii) $L_1 \bowtie_T L_2$ is a regular language for all regular languages $L_1, L_2 \subseteq \Sigma^*$.

(iii) $L_1 \triangle_T L_2$ is a regular language for all regular languages $L_1, L_2 \subseteq \Sigma^*$.

Proof. The implications (i) $\Rightarrow$ (ii) and (i) $\Rightarrow$ (iii) follow by Lemma 4 due to the
closure of the class of regular languages with respect to shuffle, finite substitution,
morphisms and intersection.

To show the implication (ii) $\Rightarrow$ (i), assume that $L_1 \bowtie_T L_2$ is a regular
language for all regular languages $L_1, L_2 \subseteq \Sigma^*$. Let $a, b \in \Sigma$ without loss of
generality, then also $L = a^* \bowtie_T b^*$ is a regular language, and $T = \phi(L)$, $\phi$ being
the coding defined in the proof of Lemma 3. Consequently, $T$ is regular. The
implication (iii) $\Rightarrow$ (i) can be shown analogously. □

Theorem 2. For all regular sets of trajectories $T \subseteq V^*$ and regular languages
$L_1, L_2 \subseteq \Sigma^*$, $L_1 \triangleright_T L_2$ is a regular language.

Proof. The same as the proof of Theorem 1, (i) $\Rightarrow$ (ii). □

Theorem 3. Let $\diamond_T$ be any of the operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$.

(i) Let any two of the languages $L_1$, $L_2$, $T$ be regular and the third one be
context-free. Then $L_1 \diamond_T L_2$ is a context-free language.

(ii) Let any two of the languages $L_1$, $L_2$, $T$ be context-free and the third one be
regular. Then $L_1 \diamond_T L_2$ is a non-context-free language for some triples ($L_1$,
$L_2$, $T$).

Proof. (i) Follows by Lemma 4, and by closure of the class of context-free
languages with respect to finite substitution, shuffle, morphisms and
intersection with regular languages.

(ii) Consider the alphabet $\Sigma = \{a, b, c, d\}$.

1. Let $\diamond_T = \bowtie_T$.

(1) Consider $L_1 = \{a^n d b^{2n} \mid n > 0\}$, $L_2 = \{a^m c^m \mid m > 0\}$ and $T = V^*$,
then $(L_1 \bowtie_T L_2) \cap a^* d a^* c^* = \{a^n d a^n c^n\}$.

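(2) Consider $L_1 = \{a^n b^{2n} \mid n > 0\}$, $L_2 = c^+$ and $T = \{0^{2m} 1^m \mid m > 0\}$,
then $L_1 \bowtie_T L_2 = \{a^n b^n c^n\}$.

(3) Consider $L_1 = a^+$, $L_2 = \{b^n c^n \mid n > 0\}$ and $T = \{0^m 1^{2m} \mid m > 0\}$,
then $L_1 \bowtie_T L_2 = \{a^n b^n c^n\}$.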

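2. Let $\diamond_T = \triangle_T$. Consider:

(1) $L_1 = \{a^n b a^k b a^l \mid k + l + 1 = 2n > 0\}$, $L_2 = \{a^m b a^{m+1} \mid m > 0\}$ and
$T = 0^+ 1^+$,

(2) $L_1 = \{a^n b^n a^* \mid n > 0\}$, $L_2 = a^+$ and $T = \{0^{2m+1} 1^m \mid m > 0\}$,

(3) $L_1 = a^+ b a^+$, $L_2 = \{a^n b a^n \mid n > 0\}$ and $T = \{0^m 1^{2m+1} \mid m > 0\}$,

then in all three cases $(L_1 \triangle_T L_2) \cap a^* b^* a b^* = \{a^n b^n a b^n\}$.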

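3. Let $\diamond_T = \triangleright_T$. Consider:

(1) $L_1 = \{c^{2m} d c^m a^* \mid m > 0\}$, $L_2 = \{a^n b^n d a^* \mid n > 0\}$ and $T = V^+$,

(2) $L_1 = \{b^n a^n d b^+ a^* \mid n > 0\}$, $L_2 = a^+ b^+ d a^+$ and $T = \{1^{2m} 0 1^m 0^* \mid m > 0\}$,

(3) $L_1 = c^+ d c^+ a^*$, $L_2 = \{a^n b^n d a^* \mid n > 0\}$ and $T = \{1^{2m} 0 1^m 0^* \mid m > 0\}$,

then in all three cases $(L_1 \triangleright_T L_2) \cap \{a, b\}^* = \{a^n b^n a^n\}$.

In all the above cases we have shown that $L_1 \diamond_T L_2$ is a non-context-free
language.

□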



6 Decision Problems 

In this section we study three elementary types of decision problems for language
equations of the form $L_1 \diamond_T L_2 = R$, where $\diamond_T$ is one of the operations $\bowtie_T$, $\triangle_T$,
$\triangleright_T$. These problems, studied already for various binary word operations in [7,
6, 1, 10, 5] and others, are stated as follows. First, given $L_1$, $L_2$ and $R$, one asks
whether the above equation holds true. Second, the existence of a solution $L_1$
to the equation is questioned, when $L_1$ is unknown (the left operand problem).
Third, the same problem is stated for the right operand $L_2$. All these problems
have their variants when one of $L_1$, $L_2$ (the unknown language in the case of the
operand problems) consists of a single word.

We focus now on the case when $L_1$, $L_2$ and $T$ are all regular languages.
Then $L_1 \diamond_T L_2$ is also a regular language by Theorems 1, 2, $\diamond_T$ being any of the
operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$. Immediately we obtain the following result.

Theorem 4. The following problems are both decidable if the operation $\diamond_T$ is
one of $\bowtie_T$, $\triangle_T$, $\triangleright_T$, $T$ being a regular set of trajectories:

(i) For given regular languages $L_1$, $L_2$, $R$, is $L_1 \diamond_T L_2 = R$?

(ii) For given regular languages $L_1$, $R$ and a word $w \in \Sigma^*$, is $L_1 \diamond_T w = R$?

The decidability of the left and the right operand problems for languages
is also a straightforward consequence of the results in Section 5 and some
previously known facts about language equations [7].

Theorem 5. Let $\diamond_T$ be one of the operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$. The problem “Does
there exist a solution $X$ to the equation $X \diamond_T L = R$?” (left-operand problem)
is decidable for regular languages $L$, $R$ and a regular set of trajectories $T$.

Proof. Due to [7], if a solution to the equation $X \diamond_T L = R$ exists, then
$X_{\max} = (R^c \diamond_T^l L)^c$ is also a solution, $\diamond_T$ being an invertible binary word
operation. In fact, $X_{\max}$ is the maximum (with respect to the subset relation) of all
the sets $X$ such that $X \diamond_T L \subseteq R$. We can conclude that a solution $X$ exists iff

$$(R^c \diamond_T^l L)^c \diamond_T L = R \qquad (4)$$

holds. Observe that if $\diamond_T$ is one of $\bowtie_T$, $\triangle_T$, $\triangleright_T$, then $\diamond_T^l$ is $\triangle_T$, $\bowtie_T$ or $\triangle_T'$,
respectively, by Lemma 2. Hence the left side of the equation (4) represents
an effectively constructible regular language by Theorems 1, 2. Consequently,
the validity of (4) is decidable and moreover the maximal solution $X_{\max} =
(R^c \diamond_T^l L)^c$ can be effectively found if one exists. □

Theorem 6. Let $\diamond_T$ be one of the operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$. The problem “Does
there exist a solution $X$ to the equation $L \diamond_T X = R$?” (right-operand problem)
is decidable for regular languages $L$, $R$ and a regular set of trajectories $T$.

Proof. Similarly as in the proof of Theorem 5, a maximal solution to the equation
$L \diamond_T X = R$ is $X_{\max} = (L \diamond_T^r R^c)^c$ for a binary word operation $\diamond_T$, see [7].
Hence a solution $X$ exists iff

$$L \diamond_T (L \diamond_T^r R^c)^c = R. \qquad (5)$$

By Lemma 2, if $\diamond_T$ is one of $\bowtie_T$, $\triangle_T$, $\triangleright_T$, then $\diamond_T^r$ is $\triangleright_T$, $\triangleright_T'$ or $\bowtie_T$, respectively.
Again the validity of (5) is effectively decidable by Theorems 1, 2, and, moreover,
a maximal solution $X_{\max} = (L \diamond_T^r R^c)^c$, if it exists, can be effectively found. □

The situation is a bit different in the case when the existence of a singleton
solution to the left or the right operand problem is questioned. A different proof
technique is needed.

Theorem 7. Let $\diamond_T$ be one of the operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$. The problem “Does
there exist a word $w$ such that $w \diamond_T L = R$?” is decidable for regular languages
$L$, $R$ and a regular set of trajectories $T$.

Proof. Assume that $\diamond_T$ is one of $\bowtie_T$, $\triangle_T$, $\triangleright_T$. Observe first that if $y \in w \diamond_T x$
for some $w, x, y \in \Sigma^*$, then $|y| \le |w|$. Therefore, if $R$ is infinite, then there
cannot exist a solution $w$ of a finite length satisfying $w \diamond_T L = R$. Hence for an
infinite $R$ the problem is trivial.

Assume now that $R$ is finite. As shown in [7], the regular set $X_{\max} =
(R^c \diamond_T^l L)^c$ is the maximal set with the property $X \diamond_T L \subseteq R$. Hence $w$ is a
solution of $w \diamond_T L = R$ iff

(i) $w \diamond_T L \subseteq R$, i.e. $w \in X_{\max}$, and

(ii) $w \diamond_T L \not\subset R$, i.e. $w \diamond_T L$ is not a proper subset of $R$.

Moreover, (ii) is satisfied iff $w \diamond_T L \not\subseteq R_1$ for all $R_1 \subset R$, and hence $w \notin
(R_1^c \diamond_T^l L)^c$. Hence we can conclude that the set $S$ of all singleton solutions to
the equation $w \diamond_T L = R$ can be expressed as

$$S = (R^c \diamond_T^l L)^c - \bigcup_{R_1 \subset R} (R_1^c \diamond_T^l L)^c.$$

Since we assume that $R$ is finite, the set $S$ is regular and effectively constructible
by Lemma 2, Theorems 1, 2 and the closure of REG under finite union and
complement. Hence it is also decidable whether $S$ is empty or not, and
all its elements can be effectively listed. □

Theorem 8. Let $\diamond_T$ be one of the operations $\bowtie_T$, $\triangle_T$, $\triangleright_T$. The problem “Does
there exist a word $w$ such that $L \diamond_T w = R$?” is decidable for regular languages
$L$, $R$ and a regular set of trajectories $T$.

Proof. Assume first that $\diamond_T$ is one of $\bowtie_T$, $\triangle_T$. Observe that if $y \in x \diamond_T w$ for
some $w, x, y \in \Sigma^*$, then $|y| \ge |w|$. Therefore, if a solution $w$ to the equation
$L \diamond_T w = R$ exists, then $|w| \le k$, where $k = \min\{|y| \mid y \in R\}$. Hence, to
verify whether a solution exists or not, it suffices to test all the words from
$\Sigma^0 \cup \Sigma^1 \cup \cdots \cup \Sigma^k$.

Focus now on the operation $\triangleright_T$. Analogously to the case of Theorem 7, we
can deduce that there is no word $w$ satisfying $L \triangleright_T w = R$, if $R$ is infinite.
Furthermore, the set $X_{\max} = (L \triangleright_T^r R^c)^c = (L \bowtie_T R^c)^c$ is the maximal set with
the property $L \triangleright_T X \subseteq R$. The same arguments as in the proof of Theorem 7
allow one to express the set of all singleton solutions as

$$S = (L \bowtie_T R^c)^c - \bigcup_{R_1 \subset R} (L \bowtie_T R_1^c)^c.$$

For a finite $R$, the set $S$ is regular and effectively constructible, hence we can
decide whether it contains at least one solution. □

We add that in the above cases of the left and the right operand problems, 
if there exists a solution, then at least one can be effectively found. Moreover, in 
the case of their singleton variants, all the singleton solutions can be effectively 
enumerated. 

7 Applications 

In this section we discuss a few applications of the substitution-on-trajectories
operation in modelling certain noisy channels and a cryptanalysis problem. In
the former case, we revisit a decidability question involving the property of
error-detection.

For positive integers $m$ and $l$, with $m < l$, consider the SID channel [12]
that permits at most $m$ substitution errors in any $l$ (or fewer) consecutive symbols
of any input message. Using the operation $\bowtie_T$, this channel is defined as the
set of pairs of words $(u, v)$ such that $v$ is in $u \bowtie_T \Sigma^*$, where $T$ is the set of all
trajectories $t$ such that, for any subword $s$ of $t$, if $|s| \le l$ then $|s|_1 \le m$. In general,
following the notation of [8], for any trajectory set $T$ we shall denote by $[\bowtie_T \Sigma^*]$
the channel $\{(u, v) \mid v \in u \bowtie_T \Sigma^*\}$. In the context of noisy channels, the concept
of error-detection is fundamental [13]. A language $L$ is called error-detecting for
a channel $\gamma$, if $\gamma$ cannot transform a word in $L_\lambda$ to another word in $L_\lambda$; that is,
if $u, v \in L_\lambda$ and $(u, v) \in \gamma$ then $u = v$. Here $L_\lambda$ is the language $L \cup \{\lambda\}$. The
empty word in this definition is needed in case the channel permits symbols to
be inserted into, or deleted from, messages; see [13] for details. In our case,
where only substitution errors are permitted, the above definition remains valid
if we replace $L_\lambda$ with $L$.

In [13] it is shown that, given a rational relation $\gamma$ and a regular language
$L$, we can decide in polynomial time whether $L$ is error-detecting for $\gamma$. Here we
take advantage of the fact that the channels $[\bowtie_T \Sigma^*]$ permit only substitution
errors and improve the time complexity of the above result.



Theorem 9. The following problem is decidable in time $O(|A|^2 |T|)$.

Input: NFA $A$ over $\Sigma$ and NFA $T$ over $\{0, 1\}$.

Output: Y/N, depending on whether $L(A)$ is error-detecting for $[\bowtie_{L(T)} \Sigma^*]$.

Proof. In [9] it is shown that given an NFA $A$, one can construct the NFA $A^{\sigma}$,
in time $O(|A|^2)$, such that the alphabet of $A^{\sigma}$ is $E = \Sigma \times \Sigma$ and the language
accepted by $A^{\sigma}$ consists of all the words of the form $(x_1, y_1) \cdots (x_n, y_n)$, with each
$(x_i, y_i) \in E$, such that $x_1 \cdots x_n \ne y_1 \cdots y_n$ and the words $x_1 \cdots x_n$ and $y_1 \cdots y_n$
are in $L(A)$. Let $\phi$ be the morphism of $E$ into $\{0, 1\}$ such that $\phi(x, y) = 0$ iff
$x = y$. One can verify that $L(A)$ is error-detecting for $[\bowtie_{L(T)} \Sigma^*]$ iff the language
$\phi(L(A^{\sigma})) \cap L(T)$ is empty. Using this observation, the required algorithm consists
of the following steps: (i) Construct the NFA $A^{\sigma}$ from $A$. (ii) Construct the NFA
$\phi(A^{\sigma})$ by simply replacing each transition $s(x, y) \to t$ of $A^{\sigma}$ with $s\phi(x, y) \to t$.
(iii) Use a product construction on $\phi(A^{\sigma})$ and $T$ to obtain an NFA $B$ accepting
$\phi(L(A^{\sigma})) \cap L(T)$. (iv) Perform a depth-first search on the graph of $B$
to test whether there is a path from the start state to a final state. □
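
The decision procedure of Theorem 9 can be sketched in Python as a single reachability search. This is a simplified illustration and not the construction of [9]: instead of building the intermediate automata one after another, it explores their product directly, remembering in a flag whether the two simulated words already differ. The NFA encoding (a triple of start states, final states and a transition dictionary, without empty transitions) and the name `is_error_detecting` are assumptions made for this sketch.

```python
from collections import deque

def is_error_detecting(A, T, sigma):
    """A and T are NFAs given as (starts, finals, delta), where
    delta[(state, symbol)] is a set of successor states.
    Returns True iff no word of L(A) can be turned into a different
    word of L(A) by substitutions along a trajectory in L(T)."""
    a_starts, a_finals, a_delta = A
    t_starts, t_finals, t_delta = T
    # A product state simulates A on the sent word, A on the received word,
    # T on the trajectory, plus a flag "the two words already differ".
    start = {(p, q, s, False)
             for p in a_starts for q in a_starts for s in t_starts}
    seen, queue = set(start), deque(start)
    while queue:
        p, q, s, differ = queue.popleft()
        if differ and p in a_finals and q in a_finals and s in t_finals:
            return False          # found u != v in L(A) linked by the channel
        for x in sigma:
            for y in sigma:
                bit = "0" if x == y else "1"
                for p2 in a_delta.get((p, x), ()):
                    for q2 in a_delta.get((q, y), ()):
                        for s2 in t_delta.get((s, bit), ()):
                            nxt = (p2, q2, s2, differ or x != y)
                            if nxt not in seen:
                                seen.add(nxt)
                                queue.append(nxt)
    return True

if __name__ == "__main__":
    # L(A) = {00, 11}
    A = ({0}, {2, 4}, {(0, "0"): {1}, (1, "0"): {2},
                       (0, "1"): {3}, (3, "1"): {4}})
    # Trajectories with at most one 1 (at most one substitution error).
    T_one = ({0}, {0, 1}, {(0, "0"): {0}, (0, "1"): {1}, (1, "0"): {1}})
    # All trajectories.
    T_all = ({0}, {0}, {(0, "0"): {0}, (0, "1"): {0}})
    print(is_error_detecting(A, T_one, "01"))  # True
    print(is_error_detecting(A, T_all, "01"))  # False
```

The number of product states is proportional to the square of the number of states of the first automaton times the number of states of the second, which matches the shape of the bound in the theorem; the breadth-first search used here could equally be a depth-first search.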

We close this section with a cryptanalysis application of the operation $\bowtie_T$.
Let $U$ be a set of candidate binary messages (words over $\{0, 1\}$) and let $K$ be a
set of possible binary keys. An unknown message $w$ in $U$ is encrypted as $w \oplus t$,
where $t$ is an unknown key in $K$, and $\oplus$ is the exclusive-OR logic operation.
Let $e$ be an observed encrypted message and let $T$ be a set of possible guesses
for $t$, with $T \subseteq K$. We want to find the subset $X$ of $U$ for which $X \oplus T = e$,
that is, the possible original messages that can be encrypted as $e$ using the keys
we have guessed in $T$. In general $T$ can be infinite and given, for instance, by a
regular expression describing the possible pattern of the key. We can model this
problem using the following observation whose proof is based on the definitions
of the operations $\bowtie_T$ and $\oplus$, and is left to the reader.

Lemma 5. For every word $v$ and trajectory $t$, $v \bowtie_t \Sigma^* = \{v \oplus t\}$.

By the above lemma, we have that the equation $X \oplus T = e$ is equivalent
to $X \bowtie_T \Sigma^* = e$. By Theorem 5, we can decide whether there is a solution for
this equation and, in this case, find the maximal solution $X_{\max}$. In particular,
$X_{\max} = (e^c \triangle_T \Sigma^*)^c$. Hence, one needs to compute the set $e^c \triangle_T \Sigma^*$. Most likely,
for a general $T$, this problem is intractable. On the other hand, this method
provides an alternate way to approach the problem.
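
The identity of Lemma 5 can be checked experimentally with a short Python sketch. It assumes binary words and a trajectory of the same length as the word; the function names `xor_words` and `subst_any` are hypothetical and exist only for this illustration.

```python
def xor_words(v, t):
    """Bitwise exclusive OR of two binary words of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(v, t))

def subst_any(v, t, sigma="01"):
    """All words obtained from v by replacing each letter at a 1-position
    of t with some different letter of sigma (substitution on t by any word)."""
    if len(v) != len(t):
        return set()
    results = {""}
    for a, bit in zip(v, t):
        if bit == "0":
            results = {r + a for r in results}
        else:
            results = {r + b for r in results for b in sigma if b != a}
    return results

if __name__ == "__main__":
    v, t = "0110", "0101"
    print(subst_any(v, t))                       # {'0011'}
    print(subst_any(v, t) == {xor_words(v, t)})  # True
```

Over a binary alphabet there is exactly one letter different from the current one, so substituting at the 1-positions of the key is the same as flipping those bits, which is what the exclusive OR does.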
Abstract. We study biological recombination from language-theoretic
and machine learning points of view. Two generative systems to model
recombinations are introduced and polynomial-time algorithms for their
language membership, parsing and equivalence problems are described.
Another polynomial-time algorithm is given for finding a small model for
a given set of recombinants.



1 Introduction 

Recombination is one of the main mechanisms producing genetic variation. Sim- 
ply stated, recombination refers to the process in which the DNA molecules of 
a father chromosome and a mother chromosome get entangled and then split 
off to produce the DNA of the child chromosome, composed of segments taken 
alternately from the father DNA and the mother DNA (Fig 1) [1]. 




Fig. 1. Recombination 



Combinatorial structures created by iterated recombinations have attracted 
lots of interest recently. The discovery of so-called haplotype blocks [3, 5] has also 
inspired the development of new efficient algorithms for the analysis of structural 
regularities of the DNA, from various perspectives; e.g. [9,4]. Some methods for 
genetic mapping such as the recent approach of [7] also model recombinations. 

In this paper we study recombination from language-theoretic and machine
learning points of view. Two simple systems are introduced to generate
recombinants starting from certain founding strings. Membership, parsing and
equivalence problems for these systems turn out to be easy in general. More
interesting and also much harder is the problem of inverting recombinations:
given a sample set of recombinants we want to construct a smallest possible
system generating a language that contains the sample.

The paper is organized as follows. Section 2 introduces simple recombination
systems. Such a system is specified just by giving a set of strings, the “founders”
of a population. Section 3 introduces another system, called the fragmentation
model, in which the strings that can be used as segments of recombinants are
listed explicitly. Language membership, parsing and equivalence problems for
these two systems are polynomial-time solvable, by well-known techniques from
finite automata [6] and string matching [2]. In Section 4 we consider a
machine learning type of problem of constructing a good fragmentation model for
a sample set of recombinants. We give a polynomial-time algorithm that finds
a smallest model in a special case. The algorithm also seems useful in the
general case, although the result is not necessarily minimal.

2 Simple Recombination Systems 

A recombination is an operation that takes two strings $u$ and $v$ of equal length
$n$ and produces a new string $w$, also of length $n$, called a recombinant of $u$ and
$v$, such that

$$w = xy$$

where $x$ is a prefix of $u$ and $y$ is a suffix of $v$, or $x$ is a prefix of $v$ and $y$ is a
suffix of $u$. The recombinant $w$ is said to have a cross-over at location $|x|$. For
simplicity we assume that a recombinant may have only one cross-over. As $x$ or
$y$ may be the empty string, $u$ and $v$ themselves are recombinants of $u$ and $v$.

Let $A$ be a set of strings of length $n$. The set of strings generated from $A$ in
one recombination step is denoted

$$\mathcal{R}(A) = \{w \mid w \text{ is a recombinant of some } u, v \in A\}.$$

Let $\Sigma$ be a finite alphabet. A simple $m \times n$ recombination system in $\Sigma$ is
defined by a set $F \subseteq \Sigma^n$ consisting of $m$ strings of length $n$ in $\Sigma$. The strings
in $F$ are called the founders of the system. System $F$ generates new sequences
by iterating the recombination operation. The generative process has a natural
division into generations giving the corresponding languages $G_0(F), G_1(F), \ldots$
as follows:

$$G_0(F) = F$$
$$G_1(F) = \mathcal{R}(F)$$
$$\cdots$$
$$G_i(F) = \mathcal{R}(G_{i-1}(F)).$$


As $G_0(F) \subseteq G_1(F) \subseteq \cdots \subseteq G_i(F) \subseteq \cdots \subseteq \Sigma^n$ there must be $j$ such that after
the $j$th generation nothing new can be produced, that is, $G_{j'}(F) = G_j(F)$ for
all $j' \ge j$. We call $L(F) = G_j(F)$ the full recombinant language of system $F$.

Example 1. Let $\Sigma = \{0, 1\}$, $n = 4$, and consider the $2 \times 4$ system $F = \{0000, 0111\}$.
Then $G_1(F) = \{0000, 0111, 0100, 0110, 0011, 0001\}$ and $G_2(F) =
\{0000, 0111, 0100, 0110, 0011, 0001, 0101, 0010\}$. Language $G_2(F)$ consists of
all strings in $\Sigma^n$ that start with 0. This is also the full language $L(F)$.

It should be obvious that $w$ is in $L(F)$ if and only if $w$ can be written as

$$w = \alpha_1 \alpha_2 \cdots \alpha_p \qquad (1)$$

for some non-empty strings $\alpha_i \in \Sigma^+$ such that each $\alpha_i$ occurs in some founder
string $f_j \in F$ at the same location as in $w$. That is, we have $f_j = \gamma \alpha_i \delta$ for some
$\gamma$ such that $|\gamma| = |\alpha_1 \cdots \alpha_{i-1}|$. Each decomposition (1) of $w$ into fragments $\alpha_i$ is
called a parse of $w$ with respect to $F$.

String $w$ may have several different parses. Two of them are of special interest.
First, if $w$ has some parse (1) then it also has a parse such that $|\alpha_i| = 1$ for all
$i = 1, 2, \ldots, p$ and $p = n$. We then note that a string $w_1 w_2 \cdots w_n$, where $w_i \in \Sigma$,
belongs to $L(F)$ if and only if for each $w_i$ there is some $f_j \in F$ whose $i$th symbol
is $w_i$. Let us denote by $\Sigma_i$ the symbols in $\Sigma$ that occur at the $i$th location of
some string in $F$. We call $\Sigma_i$ the local alphabet of $F$ at $i$. Summarizing, we get
the following simple result.

Theorem 1. $L(F) = \Sigma_1 \Sigma_2 \cdots \Sigma_n$. □

This immediately gives a language equivalence test for recombination
systems. Let $E$ and $F$ be two recombination systems of length $n$, and let $\Pi_1, \Pi_2, \ldots,
\Pi_n$ be the local alphabets of $E$ and $\Sigma_1, \Sigma_2, \ldots, \Sigma_n$ the local alphabets of $F$. Then
$L(E) = L(F)$ if and only if $\Pi_i = \Sigma_i$ for all $i = 1, 2, \ldots, n$. So, for example,
systems $\{0000, 1111\}$ and $\{0101, 1010\}$ are equivalent as all local alphabets are
equal to $\{0, 1\}$.
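
The membership and equivalence tests based on local alphabets can be sketched in Python as follows. The sketch assumes that a system is given as a non-empty collection of equal-length strings; the names `local_alphabets`, `in_full_language` and `equivalent` are hypothetical.

```python
def local_alphabets(F):
    """For each location, the set of symbols that some founder has there."""
    return [set(column) for column in zip(*F)]

def in_full_language(w, F):
    """w belongs to the full language iff every symbol of w lies in the
    local alphabet of its location."""
    alphabets = local_alphabets(F)
    return len(w) == len(alphabets) and all(
        symbol in alphabet for symbol, alphabet in zip(w, alphabets))

def equivalent(E, F):
    """Two systems generate the same full language iff all their local
    alphabets coincide."""
    return local_alphabets(E) == local_alphabets(F)

if __name__ == "__main__":
    print(equivalent({"0000", "1111"}, {"0101", "1010"}))  # True
    print(in_full_language("0101", {"0000", "0111"}))      # True
    print(in_full_language("1000", {"0000", "0111"}))      # False
```

Both tests run in time linear in the total length of the founders, since each founder symbol is inspected once when the local alphabets are collected.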

The simplicity of the equivalence test also indicates that the sequential
structure of the founders has totally disappeared in $L(F)$. Therefore it is more
interesting to look at strings that have a parse consisting of a small number of
fragments $\alpha_i$. This leads us to define the canonical parses.

Let $w \in L(F)$. Then a parse $w = \alpha_1 \alpha_2 \cdots \alpha_p$ of $w$ with respect to $F$ is
canonical, if

1. $p$ is smallest possible; and

2. among parses of $w$ with $p$ fragments, each $|\alpha_1 \alpha_2 \cdots \alpha_i|$, $1 \le i \le p$, is largest
possible.

A canonical parse of $w$ is easily seen to be unique. It can be found by the following
greedy parsing algorithm. First find the longest prefix of $w$ that is also a prefix of
some string in $F$. This prefix is fragment $\alpha_1$ of the canonical parse. Then remove
the prefix of length $|\alpha_1|$ from $w$ and from all members of $F$. Repeat the same
steps to find the longest prefix that becomes $\alpha_2$, and so on, until the entire $w$ has
been processed or it turns out that parsing cannot be continued to the end of
$w$, in which case $w \notin L(F)$.
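
The greedy parsing steps above can be sketched directly in Python. This is a plain version that rescans the founders for every fragment, without any preprocessing; it assumes that all founders have the same length as the string to be parsed, and the name `canonical_parse` is hypothetical.

```python
def canonical_parse(w, F):
    """Greedy parse of w with respect to the founders F.
    Returns the list of fragments, or None if w cannot be parsed."""
    n = len(w)
    fragments = []
    i = 0
    while i < n:
        # Longest stretch starting at location i that some founder shares
        # with w at exactly the same location.
        best = 0
        for f in F:
            k = 0
            while i + k < n and f[i + k] == w[i + k]:
                k += 1
            best = max(best, k)
        if best == 0:
            return None           # no founder has w's symbol at location i
        fragments.append(w[i:i + best])
        i += best
    return fragments

if __name__ == "__main__":
    F = ["0000", "0111"]
    print(canonical_parse("0101", F))   # ['01', '0', '1']
    print(canonical_parse("0000", F))   # ['0000']
    print(canonical_parse("1000", F))   # None
```

The number of cross-overs of the parse is the number of fragments minus one, so the first call above reports two cross-overs.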

We will use the number $p - 1$ of the cross-overs in the canonical parse as
a distance measure for strings: the recombination distance $\rho(w, F)$ between $w$
and $F$ is $p - 1$, the smallest possible number of cross-overs in a parse of $w$ with
respect to $F$. Note that $\rho(w, F) \le n - 1$ for all $w \in L(F)$. If $w \notin L(F)$, we let
$\rho(w, F) = \infty$.

The greedy parsing algorithm finds $\rho(w, F)$. The algorithm works without
any preprocessing of $F$ and can be organized to run in time $O(mn)$, i.e. linear
in the total length of the strings in $F$. We now describe a preprocessing of $F$
which constructs a collection of trie structures such that canonical parsing of
any string with respect to $F$ can be done in optimal time $O(n)$.

In the canonical parsing by the greedy method one has to find the longest
prefix of the current suffix of $w$ that is common with the corresponding suffix of
some founder $f \in F$. Let $w^i = w_i w_{i+1} \cdots w_n$ and $f_j^i = f_{ji} f_{j,i+1} \cdots f_{jn}$ denote the
$i$th suffixes of $w$ and the founders, and let $T^i$ denote the trie representing strings
$f_1^i, f_2^i, \ldots, f_m^i$. The longest common prefix, that will become the first fragment
of the parse, can be found by traversing the path of $T^1$ for $w^1$ until a symbol of
$w^1$ is encountered, say $w_h$, that is not present in $T^1$ (or $w^1$ ends). The scanned
prefix is the fragment $\alpha_1$ of the canonical parse. We needed $O(|\alpha_1|)$ time to find
it this way. The parsing continues by next traversing the path of $T^h$ for $w^h$,
giving $\alpha_2$ in time $O(|\alpha_2|)$, and so on.

To make this work we need the tries $T^1, \ldots, T^n$. A straightforward
construction of a trie for $m = |F|$ strings of length $n$ takes time $O(mn)$ assuming
that $|\Sigma|$ is constant. Hence the total time for all tries would be $O(mn^2)$. We
next describe a suffix-tree based technique for constructing these tries in time
$O(mn)$.

The suffix-tree of a string $x$ is a (compacted) trie representing all the suffixes
of $x$. The size of the tree is $O(|x|)$, and it can be constructed in time $O(|x|)$
by several alternative algorithms [2]. To get the tries $T^h$ that form our parser
for $F$ we first augment the founder strings with explicit location indices, such
that founder string $f_j = f_{j1} f_{j2} \cdots f_{jn}$ becomes $f'_j = (f_{j1}, 1)(f_{j2}, 2) \cdots (f_{jn}, n)$.
Now construct the suffix-tree $T$ for string $f' = f'_1 f'_2 \cdots f'_m$. Then trie $T^h$ consists
of the subtrees of $T$ representing suffixes that start with symbols $(a, h)$, where
$a \in \Sigma$. Hence tries $T^h$ can be extracted from $T$ in one scan through the edges
that are adjacent to the root.

This construction can be performed in $O(mn)$ time, i.e., linear time in the
length of $f'$, although we have formally used an alphabet of non-constant size $|\Sigma| n$.
This is because the non-root nodes of $T$ may only have $|\Sigma|$ branches and hence
the branching degree at such nodes does not depend on $n$. While the root node
can have $O(|\Sigma| n)$ branches, the dependency on $n$ can be made constant by direct
indexing (or bucketing) on the second component of a symbol.

Finally note that the tries $T^h$ extracted from $T$ are of compacted form, i.e.,
the non-branching nodes of the trie are represented only implicitly. The edges of
a compacted trie correspond to strings (instead of single symbols), represented
by pairs of pointers to the original strings in $F$. In our greedy parsing algorithm
such tries can be used as well, without significant overhead.

Theorem 2. Given an $m \times n$ recombination system $F$, a greedy parser for $F$
can be constructed in time $O(mn)$. For any string $w$, the parser computes in time
$O(n)$ the canonical parse of $w$ with respect to $F$ and the recombination distance
$\rho(w, F)$. □

Canonical parsing is not the only possible use of the parser of Theorem 2.
All possible parses of $w$ can be generated if, instead of greedily traversing the
current trie as far as possible, the parsing jumps to the next trie at any point
on the way. It is also possible to check whether or not $w$ has a parse with given
cross-over points: then the parsing should jump to the next trie exactly at these
points. The parses can also be utilized to find a string $w$ with largest possible
distance $\rho(w, F)$.

3 Generalized Recombination Systems and Fragmentation 
Models 

Parsing a string as introduced in the previous section means decomposing the 
string into fragments taken from the founders. The available fragments were 
implicitly defined by the founders: any substring of a founder can be used in a 
parse. 

We now go one step further and introduce models in which the available 
fragments are listed explicitly. 

A fragmentation model of length $n$ in alphabet $\Sigma$ is a state-transition system
$M = (S, Q, \Sigma, n)$ consisting of a finite set $S$ of states and a set $Q$ of
transitions. Each $s \in S$ is a pair $(i, v)$, where $i$ is an integer $1 \le i \le n$, and string
$v \in \Sigma^*$ is the fragment of the state such that $|v| \le n - i + 1$. We call $b(s) = i$ the
begin location and $e(s) = i + |v|$ the end location of $s$. A state $s$ is a start state
if $b(s) = 1$ and an end state if $e(s) = n + 1$. The transition set $Q$ is any subset
of $S \times S$ such that if $(r, s) \in Q$ then $e(r) = b(s)$, that is, the location intervals
covered by $r$ and $s$ should be next to each other.

The language $L(M)$ of $M$ consists of all strings generated along the transition
paths from a start state to an end state. More formally, $w \in L(M)$ if and only if
there are states $s_1, s_2, \ldots, s_p$ such that $(s_i, s_{i+1}) \in Q$ for $1 \le i < p$, $s_1$ is a start
state and $s_p$ is an end state, and $w = v_1 v_2 \cdots v_p$ where $v_i$ is the fragment of
state $s_i$. Note that all $w \in L(M)$ are of length $n$. Also note that fragmentation
models are a subclass of finite-state automata. Hence, for example, their language
equivalence is solvable by standard methods [6].

Example 2. A simple $m \times n$ recombination system $F$ of the previous section
consisting of $m$ founders $f_j = f_{j1} f_{j2} \cdots f_{jn}$ can be represented as a fragmentation
model $M = (S, Q, \Sigma, n)$ as follows: set $S$ consists of all states $(i, v)$ where $1 \le
i \le n$ and $v = f_{ji} \cdots f_{jh}$ for some $1 \le j \le m$ and $i \le h \le n$. The transition
$(r, s)$ is included into $Q$ for all $r, s$ such that $e(r) = b(s)$. Note that $M$ is much
larger than $F$. It has $O(mn^2)$ states and $O(m^2 n^3)$ transitions, and the fragments
of the states have total length $O(mn^3)$. □

Each transition path gives for the generated string a parse into fragments.
Different parses for the same string can be efficiently enumerated and analyzed
for example by using dynamic programming combined with breadth-first
traversal of the transition graph of $M$. We describe next an algorithm for finding a
parse with the smallest number of fragments, i.e., a shortest path through $M$ that
generates the string to be parsed.

Let $w$ be the string to be parsed. We say that state $(i, v)$ of $M$ matches $w$ if
$w = xvy$ where $|x| = i - 1$. We associate with each state $s$ a counter $c(s)$ whose
value will be the length of a shortest path to $s$ that will generate the prefix of
length $b(s) - 1$ of $w$. Variable $P$ will be used to store the length of a shortest
parse. The parsing algorithm is as follows:

1. Let si, S 2 . . . St be the states of M ordered according to increasing value of 
e(sj) 

2. Initialize the counters 

P ^ oo 

, , Jo , if Sj is a start state 
\ oo , otherwise 

3. for j ^ 1, 2, . . . , t do 

if c{sj) < oo and Sj matches w then 
if Sj is an end state then 
P ^ min(P, c{sj) + 1) 
else 

for all Sfc such that (sj,Sk) G Q 
c(sfe) ^ min(c(sfe),c(sj) -|- 1) 



The algorithm can be implemented such that the running time is linear in
the size of $M$. We also observe that testing which states of $M$ match $w$
can be done very fast by first constructing the Aho-Corasick multi-pattern
matching automaton [2] for the fragments of the states and then scanning $w$
with this automaton.
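The parsing algorithm can be sketched in Python as follows. This is only an illustration under stated assumptions: a state is represented as a pair `(b, v)` of begin location and fragment, transitions are pairs of state indices, the helper names `simple_model` and `shortest_parse` are hypothetical, and matching is tested by direct substring comparison rather than with an Aho-Corasick automaton.

```python
import math

def simple_model(founders):
    """Fragmentation model of a simple recombination system (as in Example 2).
    States are (b, v) pairs; transitions join states with e(r) = b(s)."""
    n = len(founders[0])
    states = sorted({(i, f[i - 1:h]) for f in founders
                     for i in range(1, n + 1) for h in range(i, n + 1)})
    transitions = {(r, s)
                   for r, (br, vr) in enumerate(states)
                   for s, (bs, _) in enumerate(states)
                   if br + len(vr) == bs}
    return states, transitions

def shortest_parse(states, transitions, w):
    """Smallest number of fragments in a parse of w, or math.inf if w is not generated."""
    n = len(w)
    end = lambda j: states[j][0] + len(states[j][1])      # e(s_j)
    succ = {}
    for r, s in transitions:
        succ.setdefault(r, []).append(s)
    # Step 2: counters are 0 for start states and infinity otherwise.
    c = [0 if b == 1 else math.inf for b, _ in states]
    best = math.inf                                        # variable P
    # Steps 1 and 3: scan the states by increasing end location.
    for j in sorted(range(len(states)), key=end):
        b, v = states[j]
        if c[j] < math.inf and w[b - 1:b - 1 + len(v)] == v:
            if end(j) == n + 1:
                best = min(best, c[j] + 1)
            else:
                for k in succ.get(j, []):
                    c[k] = min(c[k], c[j] + 1)
    return best
```

With founders `0000` and `1111`, the string `0011` is parsed with two fragments, while `0101` needs four.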



4 Model Reconstruction Problems 

The language membership and equivalence as well as parsing problems for
recombination systems turned out to be solvable by fast algorithms, not unexpectedly
as we are dealing with a limited subclass of the regular languages. We now discuss
much harder problems concerning inversion of recombinations.

Given a set $D$ of strings of length $n$ we want to find a model that could have
generated $D$. This question was addressed in [8] in the case of simple
recombination systems. For example, an algorithm was given in [8] that constructs an
$m \times n$ recombination system $F$ such that $D \subseteq L(F)$ and the average
recombination distance of the elements of $D$ from $F$ is minimized. Here we will consider
the problem of finding fragmentation models for $D$. The fragments of such a
model can be thought to represent the “conserved” substrings of $D$.

The goodness of a fragmentation model $M$ for set $D$ can be evaluated using
various criteria. A possibility is to consider probabilistic generalizations of
fragmentation models and apply model selection criteria such as the Minimum
Description Length principle. We will resort to a combinatorial approach and consider
the following parsimony criterion: find a fragmentation model $M = (S, Q, \Sigma, n)$
such that $D \subseteq L(M)$ and the number of states of $M$ is the smallest possible. We call
this the minimal fragmentation model reconstruction problem.

Example 3. Let $D$ consist of strings

000011
000111
001011
000000
000100        (2)
111011
110000
110100
111000

By taking the strings in $D$ as such (and nothing else) as the fragments we get a
fragmentation model which generates exactly $D$ and has 9 states. However, the
model depicted in Fig. 2 has only 7 states. It generates a language that properly
contains $D$. □




Fig. 2. A fragmentation model (begin locations of states not explicitly shown) 



We now rephrase the minimal fragmentation model reconstruction in terms
of certain tilings of $D$. Let us refer to the $m$ strings in $D$ by $1, 2, \ldots, m$; the $i$th
string is $d_{i1} d_{i2} \cdots d_{in}$. Then any triple $\tau = (A, h, k)$, where $A \subseteq \{1, 2, \ldots, m\}$
and $h$ and $k$ are integers such that $1 \le h \le k \le n$, is a tile of $D$. Set $A$ is the
row-set of $\tau$. The tile $\tau$ covers all substrings $d_{ih} d_{i,h+1} \cdots d_{ik}$ where $i \in A$. The
tile is uniform if all substrings it covers are equal, i.e., $d_{ih} \cdots d_{ik} = d_{jh} \cdots d_{jk}$
for all $i, j \in A$. A set $T$ of tiles of $D$ is a uniform tiling of $D$ if the tiles in $T$ are
uniform and disjoint and cover $D$, i.e., for each $d_{ij}$ there is exactly one tile in $T$
that covers $d_{ij}$.

A fragmentation model $M$ such that $D \subseteq L(M)$ induces a uniform tiling
$T(M)$ of $D$ as follows. Fix for each string $d \in D$ a path of $M$ that spells out
$d$. For any state $s = (i, v)$ of $M$, let $A$ be the set of strings in $D$ whose path
goes through $s$. Then add the tile $(A, i, i + |v| - 1)$ to $T(M)$. It should be clear
that $T(M)$ is a uniform tiling of $D$. Note that there are several different tilings
$T(M)$ if some $d \in D$ is ambiguous with respect to $M$, i.e., if $M$ has more than
one path for $d$.

On the other hand, given a uniform tiling $T$ of $D$, one may construct a
fragmentation model $M(T)$ as follows. For each tile $(A, h, k) \in T$, add to $M(T)$
a state $s = (h, v)$ where $v = d_{ih} \cdots d_{ik}$ for some $i \in A$. Also add a transition
$(s, s')$ to $M(T)$ if the tiles $(A, h, k)$ and $(A', h', k')$ in $T$ that correspond to $s$ and
$s'$ are such that the row-set intersection $A \cap A'$ is nonempty and $k + 1 = h'$.

As the number of states of $M(T)$ equals the number of tiles in $T$, and the
number of tiles in $T(M)$ is at most the number of states of $M$, we get the
following result.

Proposition 1. The number of states of the smallest fragmentation model $M$
such that $D \subseteq L(M)$ equals the number of tiles in the smallest uniform tiling
of $D$.
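The correspondence between tilings and models can be illustrated with a short Python sketch. The representation is an assumption made for the example: $D$ is a list of equal-length strings, a tile is a triple `(A, h, k)` with `A` a set of 1-based row numbers, and the function names are hypothetical.

```python
def is_uniform_tiling(D, tiles):
    """True if every tile is uniform and every entry d_ij is covered exactly once."""
    m, n = len(D), len(D[0])
    cover = [[0] * n for _ in range(m)]
    for A, h, k in tiles:
        # A tile is uniform if all the substrings it covers are equal.
        if len({D[i - 1][h - 1:k] for i in A}) != 1:
            return False
        for i in A:
            for j in range(h, k + 1):
                cover[i - 1][j - 1] += 1
    return all(c == 1 for row in cover for c in row)

def model_from_tiling(D, tiles):
    """Construct M(T): one state per tile, transitions between adjacent
    tiles whose row-sets intersect."""
    states = [(h, D[min(A) - 1][h - 1:k]) for A, h, k in tiles]
    transitions = {(s, t)
                   for s, (A, h, k) in enumerate(tiles)
                   for t, (B, h2, k2) in enumerate(tiles)
                   if A & B and k + 1 == h2}
    return states, transitions
```

The number of states returned by `model_from_tiling` equals the number of tiles, which is the easy direction of Proposition 1.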

To solve the minimal fragmentation model reconstruction problem we will construct
small uniform tilings for $D$. We will proceed in two main steps. First a rather
simple dynamic programming algorithm is given to find optimal tilings in a
subclass called the column-structured tilings. In the second step we apply certain
local transformations to further improve the solution.

A uniform tiling of $D$ is column-structured if the tiles cover $D$ in columns: for
each two tiles $(A, h, k)$ and $(A', h', k')$, if $h = h'$ then $k = k'$. The corresponding
class of fragmentation models (models whose fragments with the same begin
location are of equal length) is also called column-structured models. If a
column-structured tiling is smallest possible, then the number of tiles in each column
should obviously be minimal. Such a minimal tiling for a column is easy to find
as follows. Consider set $D(h, k)$ consisting of strings $d_{ih} \cdots d_{ik}$ for $1 \le i \le m$.
Let $A_1, A_2, \ldots, A_p$ be the partition of $\{1, 2, \ldots, m\}$ such that $i$ and $j$ belong
to the same class $A_q$ if and only if $d_{ih} \cdots d_{ik} = d_{jh} \cdots d_{jk}$. Then the tiling
$\big((A_1, h, k), \ldots, (A_p, h, k)\big)$ of $D(h, k)$ is uniform and has the smallest possible
number of tiles among tilings whose tiles are from $h$ to $k$. We denote this tiling
by $t(h, k)$ and its size $p$ by $\sigma(h, k)$.

Let $S(j)$ be the size of the smallest column-structured tiling of $D(1, j)$. Then
$S(j)$ can be evaluated for $j = 0, 1, \ldots, n$ from

$$S(0) = 0, \qquad S(j) = \min_{i < j} \big( S(i) + \sigma(i + 1, j) \big) \tag{3}$$

and $S(n)$ gives the size of the smallest column-structured uniform tiling of the entire
$D$. The usual trace-back of dynamic programming can be used to find the end
locations $j_1, j_2, \ldots, j_q = n$ of the corresponding columns. Then the smallest
tiling itself is $t(1, j_1) \cup t(j_1 + 1, j_2) \cup \cdots \cup t(j_{q-1} + 1, n)$.

Evaluation of (3) takes time $O(n^2)$ plus the time for evaluating tables $\sigma$ and
$t$ which can be accomplished in time $O(n^2 m)$ using straightforward trie-based
techniques. We have obtained the following theorem.

Theorem 3. A minimal column-structured fragmentation model for $D$ can be
constructed in time $O(n^2 m)$ where $n$ is the length and $m$ the number of strings in $D$.

Example 4. The fragmentation model in Fig. 2 for the set (2) of Example 3 is
column-structured and minimal. If string 011010 is added to (2), then algorithm
(3) will give the column-structured model in Fig. 3(a). However, the model in
Fig. 3(b) is smaller. □




Fig. 3. (a) A column-structured fragmentation model (b) A smaller model 



The tilings given by the column-wise approach can further be improved by
applying local transformations. The transformations use the following basic step.
Assume that our current tiling has adjacent tiles $(A, h, k - 1)$ and $(B, k, r)$. We
may replace these tiles by the tiles

$(A \cap B, h, r)$,

$(A \setminus B, h, k - 1)$,

$(B \setminus A, k, r)$,

and the tiling stays uniform and still covers the entire $D$. The replacement
operation has no effect if the row-set $A \cap B$ is empty. Otherwise it changes the
structure of the tiling. If $A = B$, the number of tiles is reduced by one; if $A \subsetneq B$
or $B \subsetneq A$, the number stays the same; and if $A \cap B$, $A \setminus B$ and $B \setminus A$ are all
non-empty, the number increases by one.
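The basic replacement step is simple enough to state in code. The following Python sketch uses a hypothetical function name and represents row-sets as Python sets; tiles with an empty row-set are dropped from the result.

```python
def replace_adjacent(tile_a, tile_b):
    """Apply the basic step to adjacent tiles (A, h, k-1) and (B, k, r)."""
    (A, h, k1), (B, k, r) = tile_a, tile_b
    assert k == k1 + 1, "tiles must be adjacent"
    candidates = [(A & B, h, r), (A - B, h, k1), (B - A, k, r)]
    # Keep only tiles that cover something.
    return [t for t in candidates if t[0]]
```

The length of the returned list reproduces the case analysis: one tile if $A = B$, two if one row-set properly contains the other or if they are disjoint, and three otherwise.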

Given any tiling $T$ we can improve it by the following iterative reduction
rule: apply the above local transformation on any pair of tiles $(A, h, k - 1)$ and
$(B, k, r)$ such that $A \subseteq B$ or $B \subseteq A$ (i.e., the transformation does not increase
the number of tiles). Repeat this until the local transformation is no longer
applicable. It is easy to see that the process stops in $O(mn)$ iterations. Note
that the seemingly useless transformation steps that do not make the number
of tiles smaller are indirectly helpful: they make the tiles narrower (and longer)
and hence may create a possibility for true size reduction in the next step.

There are two possible ways to include the reduction step into algorithm
(3). One can apply it only on the final result of (3). This would, for example,
improve the tiling in Fig. 3(a) into that in Fig. 3(b). The other possibility is to
apply the reduction rule also on the intermediate tiling obtained for each $D(1, j)$
during algorithm (3) and to use the reduced tiling in subsequent computation.
Sometimes this strategy will give better results than the previous one.

There is also another local transformation that makes the tiles longer and
narrower without reducing the number of tiles. This transformation eliminates
certain loop-like structures from the tiling, defined as follows. The inclusion
graph of a tiling $T$ at $j$ is a bipartite graph which has as its nodes all tiles $(A, h, k)$
and $(A', h', k')$ such that $k = j - 1$ and $h' = j$ and as its (undirected) arcs all
$\big((A, h, j - 1), (A', j, k')\big)$ such that the row-set intersection $A \cap A'$ is not empty. A
connected component of this graph is a simple loop if it contains as many nodes
as arcs. In a simple loop every node has degree 2 (i.e., two arcs are incident to
a node). The number of tiles in a simple loop equals the number of their
non-empty pairwise row-set intersections. But this means that applying our local
transformation on all such pairs will keep the number of tiles unchanged. Hence
the loop-removal transformation can safely be added to the local transformations
one should apply to make the tiling smaller.

In summary, we get an optimization algorithm that combines dynamic
programming and local transformations. It finds a local optimum with respect to
the local transformations. The running time is polynomial in the size of $D$.

5 Conclusion 

We introduced two simple language-generating systems inspired by the
recombination mechanism of nature. For the model reconstruction problem we
delineated some initial results while many questions remained open, most notably
the complexity status and approximability of the minimal model reconstruction.
Probabilistic generalizations of our models are another interesting direction for
further study.

Abstract. This paper considers algebraic aspects of Parikh matrices. We
present new, but also some old results, concerning this topic. It is proved
that in some cases the set of Parikh matrices is a noncommutative semiring
with a unit element. We also prove that the set of Parikh matrices is
closed under the operation of shuffle on trajectories and thus it is closed
under many other operations. We also present the notion of an extended
Parikh matrix, which is an extension of the notion of the Parikh matrix.
The paper also contains a number of open problems.



1 Introduction 

The Parikh vector is an important notion in the theory of formal languages. 
This notion was introduced in [11]. One of the important results concerning this 
notion is that the image by the Parikh mapping of a context-free language is 
always a semilinear set. (For details and ramifications, see [14].) The basic idea 
behind Parikh vectors is that properties of words are expressed as numerical 
properties of vectors. However, much information is lost in the transition from 
a word to a vector. 

In this paper we introduce a sharpening of the Parikh vector, where somewhat
more information is preserved than in the original Parikh vector. The new notion
is based on a certain type of matrices. All other entries above the main diagonal
contain information about the order of letters in the original word. All matrices
are triangular, with 1's on the main diagonal and 0's below it.

We introduce also the notion of extended Parikh matrix. 

Two words with the same Parikh matrix always have the same Parikh vector, 
but the converse is not true. The exact meaning of the entries in a Parikh 
matrix is given below in Theorem 1. Our second main result, Theorem 2, shows 
an interesting interconnection between the inverse of a Parikh matrix and the 
Parikh matrix of the mirror image. 

We recall some basic notation and definitions. The set of all nonnegative
integers is denoted by $N$. Let $\Sigma$ be an alphabet. The set of all words over $\Sigma$ is
$\Sigma^*$ and the empty word is $\lambda$. If $w \in \Sigma^*$ then $|w|$ denotes the length of $w$.

We very often use “ordered” alphabets. An ordered alphabet is an alphabet
$\Sigma = \{a_1, a_2, \ldots, a_k\}$ with a relation of order (“<”) on it. If we have
$a_1 < a_2 < \cdots < a_k$, then we use the notation

$$\Sigma = \{a_1 < a_2 < \cdots < a_k\}.$$

Let $a \in \Sigma$ be a letter. The number of occurrences of $a$ in a word $w \in \Sigma^*$ is
denoted by $|w|_a$. Let $u, v$ be words over $\Sigma$. The word $u$ is a scattered subword of $v$
if there exists a word $t$ such that $v \in u \sqcup\!\sqcup\, t$, where $\sqcup\!\sqcup$ denotes the shuffle operation.
We now introduce a notation very important in our subsequent considerations.

If $u, v \in \Sigma^*$, then the number of occurrences of $u$ in $v$ as a scattered subword
is denoted by $|v|_{scatt-u}$. For instance,

\acbbb\scatt—ab — 3, \bf'Cbabb\scatt—ab — b and \aabbbc\scatt—abc — b. 

Thus, partially overlapping occurrences of a word as a scattered subword are
counted as distinct occurrences. The number $|v|_{scatt-u}$ is denoted as a binomial
coefficient in [13].
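The number $|v|_{scatt-u}$ can be computed by a standard dynamic program over the prefixes of $u$. The Python sketch below is an illustration with a hypothetical function name; `ways[j]` counts the occurrences of the prefix of $u$ of length $j$ in the part of $v$ read so far.

```python
def scattered_count(v, u):
    """Number of occurrences of u in v as a scattered subword."""
    ways = [1] + [0] * len(u)
    for ch in v:
        # Go right to left so that each letter of v extends only shorter prefixes.
        for j in range(len(u), 0, -1):
            if u[j - 1] == ch:
                ways[j] += ways[j - 1]
    return ways[len(u)]
```

For example, `scattered_count("acbbb", "ab")` is 3 and `scattered_count("aabbbc", "abc")` is 6.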

Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet. The Parikh vector
$\Psi : \Sigma^* \to N^k$ is defined by

$$\Psi(w) = (|w|_{a_1}, |w|_{a_2}, \ldots, |w|_{a_k}).$$

The Parikh vector of $w$ is $(|w|_{a_1}, |w|_{a_2}, \ldots, |w|_{a_k})$. Note that the Parikh vector is
also a mapping $\Psi$ that is a morphism from the monoid $(\Sigma^*, \cdot, \lambda)$ to the monoid
$(N^k, +, (0, 0, \ldots, 0))$.

The mirror image of a word $w \in \Sigma^*$, denoted $mi(w)$, is defined as: $mi(\lambda) = \lambda$
and $mi(b_1 b_2 \cdots b_n) = b_n \cdots b_2 b_1$, where $b_i \in \Sigma$, $1 \le i \le n$.

The reader is referred to [12] for a comprehensive treatment of formal
languages and diverse background material. The most fundamental applications and
interconnections of Parikh vectors with language theory are presented in [14].

2 Parikh Matrices and Extended Parikh Matrices 

In this paper we consider “triangle” matrices. A triangle matrix is a square
matrix $M = (m_{i,j})_{1 \le i,j \le k}$ such that $m_{i,j} \in N$, for all $1 \le i, j \le k$, $m_{i,j} = 0$, for
all $1 \le j < i \le k$, and, moreover, $m_{i,i} = 1$, for all $1 \le i \le k$.

The set of all triangle matrices is denoted by $\mathcal{M}$. The set of all triangle
matrices of dimension $k \ge 1$ is denoted by $\mathcal{M}_k$. Clearly, $\mathcal{M}_k$ constitutes a monoid
under matrix multiplication.

Now we introduce the main notion of this paper. 

Definition 1. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet, where
$k \ge 1$. The Parikh matrix, denoted $\Psi_{\mathcal{M}_k}$, is the morphism:

$$\Psi_{\mathcal{M}_k} : \Sigma^* \to \mathcal{M}_{k+1},$$

defined by the condition: if $\Psi_{\mathcal{M}_k}(a_q) = (m_{i,j})_{1 \le i,j \le (k+1)}$, then for each $1 \le i \le (k+1)$,
$m_{i,i} = 1$, $m_{q,q+1} = 1$, all other elements of the matrix $\Psi_{\mathcal{M}_k}(a_q)$ being 0.

The notion of Parikh matrix was introduced in [6] and studied in [7, 8]. 
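Definition 1 can be turned into a few lines of Python. This is a minimal sketch with hypothetical names: the ordered alphabet is passed as a string listing the letters in increasing order, and matrices are plain lists of lists.

```python
def mat_mul(A, B):
    """Product of two square integer matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def letter_matrix(q, k):
    """Matrix of the letter a_q (1-based q): identity plus a 1 at (q, q+1)."""
    M = [[int(i == j) for j in range(k + 1)] for i in range(k + 1)]
    M[q - 1][q] = 1
    return M

def parikh_matrix(alphabet, w):
    """Parikh matrix of w: the product of the matrices of its letters."""
    k = len(alphabet)
    M = [[int(i == j) for j in range(k + 1)] for i in range(k + 1)]
    for ch in w:
        M = mat_mul(M, letter_matrix(alphabet.index(ch) + 1, k))
    return M
```

Since the mapping is a morphism, the matrix of a word is obtained by multiplying the letter matrices from left to right.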

The Parikh matrix is not injective. One of the major open problems is to
characterize non-injectivity, that is, to provide some natural conditions for two
words to possess the same Parikh matrix. This problem is closely linked with
the fundamental problem about the information content of a Parikh matrix: how
much does the Parikh matrix tell about a word?

It was brought to our attention by one of the referees that the term Parikh
matrix was used in [10] for the growth matrix of a morphism (or a D0L system).

Now we present the notion of extended Parikh matrix. This important notion
was introduced in [15].

Definition 2. Let $\Sigma$ be an alphabet and $u = b_1 \cdots b_{|u|}$ be a word in $\Sigma^*$ ($b_i \in \Sigma$
for all $1 \le i \le |u|$). The extended Parikh matrix induced by the word $u$ over
the alphabet $\Sigma$ (shortly, the $u$-Parikh matrix), denoted $\Psi_{\Sigma,u}$, is the monoid
morphism

$$\Psi_{\Sigma,u} : (\Sigma^*, \cdot, \lambda) \to (\mathcal{M}_{|u|+1}, \cdot, I_{|u|+1}),$$

defined by the condition: if $a \in \Sigma$ and $\Psi_{\Sigma,u}(a) = (m_{i,j})_{1 \le i,j \le |u|+1}$, then:

$$m_{i,j} = \begin{cases} 1 & \text{if } j = i \\ \delta_{b_i,a} & \text{if } j = i + 1 \\ 0 & \text{otherwise} \end{cases}$$

where $\delta_{a,b}$ is the Kronecker symbol regarding letters, that is

$$\delta_{a,b} = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{if } a \ne b \end{cases}$$

In the notation $\Psi_{\Sigma,u}$, $\Sigma$ has to be mentioned because $u$ can be considered
over any alphabet which contains its letters, and we need to know the context
we are working in.

If $\Sigma$ is known, then we will use the notation $\Psi_u$ for $\Psi_{\Sigma,u}$, especially in proofs,
for reasons of simplicity.

It is clear that if the symbol $a \in \Sigma$ doesn't occur in $u$, then $\Psi_{\Sigma,u}(a) = I_{|u|+1}$.

Let $u \in \Sigma^*$. We say that $M \in \mathcal{M}_{|u|+1}$ is an extended Parikh matrix induced
by $u$ if there exists a word $w \in \Sigma^*$ such that $M = \Psi_{\Sigma,u}(w)$. Generally, we say
that $M \in \mathcal{M}_{k+1}$ is a Parikh matrix induced by a word if there exists a word
$u \in \Sigma^*$ such that $|u| = k$ and $M$ is a Parikh matrix induced by $u$.

It's easy to see that the Parikh matrix can be obtained as a particular case
of this definition, when $u$ contains all the symbols in $\Sigma$ only once. The ordering
of the alphabet is then given by the order in which the symbols appear in $u$.

For example, if $\Sigma = \{b_1 < b_2 < \cdots < b_k\}$ is an ordered alphabet and $u \in \Sigma^*$,
$u = b_1 b_2 \cdots b_k$, then it can be easily seen that for all $a \in \Sigma$, $\Psi_{\mathcal{M}_k}(a) = \Psi_{\Sigma,u}(a)$.

It follows that for all words $w \in \Sigma^*$,

$$\Psi_{\mathcal{M}_k}(w) = \Psi_{\Sigma,u}(w).$$

Similarly, it follows that for all words w G S*, 
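Let us now give an example of a $u$-Parikh matrix computation. Let $\Sigma =
\{a, b\}$ and $u = aba$. We will compute $\Psi_{\Sigma,u}(abba)$.

We have that $\Psi_{\Sigma,u}(abba) = \Psi_{\Sigma,u}(a)\Psi_{\Sigma,u}(b)\Psi_{\Sigma,u}(b)\Psi_{\Sigma,u}(a)$, which leads to:

$$\Psi_{\Sigma,u}(abba) =
\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 2 & 2 & 2 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$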


3 About the Entries of a Parikh Matrix 

In this section we characterize the entries of the Parikh matrix. We first introduce 
some notation that will be applied in our first theorem. Recall also the notation 
\v\scatt-u defined in Section 1. 

Consider the ordered alphabet $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$, where $k \ge 1$. We
denote by $a_{i,j}$ the word $a_i a_{i+1} \cdots a_j$, where $1 \le i \le j \le k$.

We are now ready to state the basic property of the Parikh matrix.

Theorem 1. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet, where
$k \ge 1$, and assume that $w \in \Sigma^*$. The matrix $\Psi_{\mathcal{M}_k}(w) = (m_{i,j})_{1 \le i,j \le (k+1)}$ has
the following properties:

(i) $m_{i,j} = 0$, for all $1 \le j < i \le (k + 1)$,

(ii) $m_{i,i} = 1$, for all $1 \le i \le (k + 1)$,

(iii) $m_{i,j+1} = |w|_{scatt-a_{i,j}}$, for all $1 \le i \le j \le k$.

Corollary 1. The matrix $\Psi_{\mathcal{M}_k}(w)$ has as the second diagonal (i.e., the vector
$(m_{1,2}, m_{2,3}, \ldots, m_{k,k+1})$) the Parikh vector of $w$, i.e., $(m_{1,2}, m_{2,3}, \ldots, m_{k,k+1}) = \Psi(w)$.

Comment The above results are not true for the extended Parikh matrix. 
One can easily find all this information as for instance the Parikh vector by a 
simple method. 

As already pointed out, the Parikh matrix gives more information about a 
word than the classical Parikh vector, although the Parikh matrix is still not 
injective. Injectivity would of course mean that the information given by Parikh 
matrices is complete. This would be more than one can reasonably hope for: one 
cannot expect that words could be expressed as matrices in this fashion, which 
would give all information in a simple numerical form. 

So far very little is known about sets of Parikh matrices associated to lan- 
guages belonging to a fixed family such as the families in the Chomsky hierarchy. 
The following remark shows that the semilinearity result of context-free lan- 
guages does not carry over to sets of matrices. Results concerning semilinearity 
and Parikh matrices are in [2]. The notions of slender and Parikh slender lan- 
guages are studied in [3-5]. Results about the injectivity of Parikh mapping can 
be found in [1]. 
Remark 1. Consider the ordered alphabet $\{a < b\}$ and the context-free language
$L = \{a^n b^n \mid n \ge 1\}$. Clearly,

$$\Psi_{\mathcal{M}_2}(L) = \left\{ \begin{pmatrix} 1 & n & n^2 \\ 0 & 1 & n \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; n \ge 1 \right\}.$$

Hence $\Psi_{\mathcal{M}_2}(L)$ cannot be a semilinear set (for any reasonable extension of the
definition of semilinearity to matrices).

Clearly, not every triangle matrix is the Parikh matrix of some word. For
instance, the matrix

$$\begin{pmatrix} 1 & 2 & 7 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}$$

is not a Parikh matrix. This follows because $ab$ occurs as a scattered subword at
most 6 times in a word with the Parikh vector $(2, 3)$.

The product of the entries in the Parikh vector constitutes an upper bound
for the entry $m_{1,k+1}$. Thus, the size of the entry $m_{1,3}$ in Remark 1 is maximal.

Whether or not a given triangle matrix is a Parikh matrix is clearly a decidable
question.
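Decidability can be made concrete by brute force: the second diagonal fixes the Parikh vector, so only finitely many words need to be tried. The Python sketch below is an illustration with hypothetical names, not an efficient procedure; letters are represented by their indices $0, \ldots, k-1$.

```python
def parikh_matrix(k, word):
    """Parikh matrix of a word given as a sequence of letter indices 0..k-1."""
    n = k + 1
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    for q in word:
        # Right-multiplying by the matrix of letter q adds column q to column q+1.
        for i in range(n):
            M[i][q + 1] += M[i][q]
    return M

def words_with_counts(counts):
    """Yield every word (tuple of letter indices) with the given letter counts."""
    if not any(counts):
        yield ()
        return
    for a in range(len(counts)):
        if counts[a] > 0:
            counts[a] -= 1
            for rest in words_with_counts(counts):
                yield (a,) + rest
            counts[a] += 1

def is_parikh_matrix(M):
    """Decide by exhaustive search whether M is the Parikh matrix of some word."""
    k = len(M) - 1
    counts = [M[i][i + 1] for i in range(k)]   # second diagonal = Parikh vector
    if any(c < 0 for c in counts):
        return False
    return any(parikh_matrix(k, w) == M for w in words_with_counts(counts))
```

The search confirms that the matrix with entries 2, 7, 3 above the diagonal is not a Parikh matrix, whereas replacing 7 by 6 gives the Parikh matrix of $aabbb$.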



4 On the Inverse of Parikh Matrices 

We investigate interrelations between the inverse of a Parikh matrix associated to
a word $w$ and the Parikh matrix of $mi(w)$, the mirror image of $w$. Clearly, the set
of all triangle matrices of order $k > 2$ with integer entries is a noncommutative
group with respect to multiplication, the unit element being the unit matrix of
order $k$. Consequently, for each Parikh matrix $A$, there exists the inverse matrix
$A^{-1}$.

Definition 3. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet and
let $w \in \Sigma^*$ be a word. Assume that the Parikh matrix of $w$ is $\Psi_{\mathcal{M}_k}(w) =
(m_{i,j})_{1 \le i,j \le k+1}$. The alternate Parikh matrix of $w$, denoted $\overline{\Psi}_{\mathcal{M}_k}(w)$, is the
matrix $(m'_{i,j})_{1 \le i,j \le k+1}$ where $m'_{i,j} = (-1)^{i+j} m_{i,j}$, for all $1 \le i, j \le k + 1$.

Observe that the mapping $\overline{\Psi}_{\mathcal{M}_k}(w)$ is a morphism of $\Sigma^*$. For the Parikh vector
$\Psi$ and for every word $w$, $\Psi(w) = \Psi(mi(w))$. However, for the Parikh matrix
the situation is completely different. The next theorem reveals the interrelation
between the inverse of the Parikh matrix of a word $w$ and the alternate Parikh
matrix of the mirror image of $w$.

Theorem 2. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet and let
$w \in \Sigma^*$ be a word. Then:

$$[\Psi_{\mathcal{M}_k}(w)]^{-1} = \overline{\Psi}_{\mathcal{M}_k}(mi(w)).$$

Observe that Theorem 2 provides a very simple method to compute the
inverse of a Parikh matrix. One can also apply it directly to matrices: inverses
of matrices of a certain type can be computed in this way.

As an example, consider the ordered alphabet $\Sigma = \{a < b < c\}$ and assume
that $w = cbbaa$. Then

$$\Psi_{\mathcal{M}_3}(cbbaa) = \begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Since $mi(cbbaa) = aabbc$, we have by Theorem 2:

$$[\Psi_{\mathcal{M}_3}(cbbaa)]^{-1} = \overline{\Psi}_{\mathcal{M}_3}(aabbc) = \begin{pmatrix} 1 & -2 & 4 & -4 \\ 0 & 1 & -2 & 2 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
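Theorem 2 can be checked numerically on small words. The Python sketch below uses hypothetical function names and an ordered alphabet given as a string; it builds the alternate Parikh matrix of the mirror image and multiplies it with the Parikh matrix of the word.

```python
def parikh_matrix(alphabet, w):
    """Parikh matrix of w over the ordered alphabet (letters in increasing order)."""
    n = len(alphabet) + 1
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    for ch in w:
        q = alphabet.index(ch)
        for i in range(n):          # right-multiply by the matrix of the letter
            M[i][q + 1] += M[i][q]
    return M

def alternate(M):
    """Alternate matrix: entry (i, j) is multiplied by (-1)^(i+j)."""
    return [[(-1) ** (i + j) * x for j, x in enumerate(row)]
            for i, row in enumerate(M)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_via_mirror(alphabet, w):
    """Candidate inverse of the Parikh matrix of w, following Theorem 2."""
    return alternate(parikh_matrix(alphabet, w[::-1]))
```

Multiplying `parikh_matrix("abc", "cbbaa")` by `inverse_via_mirror("abc", "cbbaa")` gives the unit matrix of order 4.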

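A special relation between $|w|_{scatt-a_{i,j}}$ and $|mi(w)|_{scatt-a_{i,j}}$ is obtained in
the next corollary. In the statement the last vertical bars stand for the absolute
value.

Corollary 2. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet and let
$w \in \Sigma^*$ be a word. Assume that the Parikh matrix of $w$ is $\Psi_{\mathcal{M}_k}(w) = (m_{i,j})_{1 \le i,j \le k+1}$
and that $[\Psi_{\mathcal{M}_k}(w)]^{-1} = (m'_{i,j})_{1 \le i,j \le k+1}$. Then $|mi(w)|_{scatt-a_{i,j}} = |m'_{i,j+1}|$ for
all $1 \le i \le j \le k$.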

We consider now another method to compute the inverse of a Parikh matrix. 
We begin with some further definitions and notations. 

Let $(A, <)$ be an ordered set. The dual order of the order $<$, denoted $<^o$, is
defined as:

$$a <^o b \text{ iff } b < a.$$

Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet. The dual ordered
alphabet, denoted $\Sigma_o$, is $\Sigma_o = \{a_k < a_{k-1} < \cdots < a_1\}$.

Consider the ordered alphabet $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ and let $w \in \Sigma^*$ be
a word. The Parikh matrix associated to $w$ with respect to the dual order on $\Sigma$
is denoted by $\Psi_{\mathcal{M}_k,o}(w)$.

Let $v = (v_1, v_2, \ldots, v_n)$ be a vector. The reverse of $v$, denoted $v^{(rev)}$, is the
vector $v^{(rev)} = (v_n, v_{n-1}, \ldots, v_1)$.

Now we introduce the notion of a reverse of a triangle matrix. Let $M =
(m_{i,j})_{1 \le i,j \le n}$ be a triangle matrix. The reverse of $M$, denoted $M^{(rev)}$, is the
matrix $M^{(rev)} = (m'_{i,j})_{1 \le i,j \le n}$ where $m'_{i,j} = m_{n+1-j,n+1-i}$, for all $1 \le i < j \le n$.
(The entries on and below the main diagonal are the same in $M$ and $M^{(rev)}$.)

Note that $M^{(rev)}$ is also a triangle matrix. An easy way to obtain $M^{(rev)}$ is
to reverse in $M$ all diagonals that are parallel to the main diagonal. For instance,

$$\text{if } M = \begin{pmatrix} 1 & 2 & 3 & 7 \\ 0 & 1 & 4 & 5 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\text{ then } M^{(rev)} = \begin{pmatrix} 1 & 6 & 5 & 7 \\ 0 & 1 & 4 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$



The reader can easily verify the following proposition. (Observe that Defini- 
tion 3 can be immediately extended to concern arbitrary matrices A.) 

Proposition 1. Let A, B be two triangle matrices of the same dimension. Then 

(i) [A(^^Ajirev) ^ 

(ii) 

(Hi) A = A. 

(iv) AB = A B. 

The next theorem gives another method of computing the inverse of a Parikh 
matrix. 

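Theorem 3. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet and let
$w \in \Sigma^*$ be a word. Then

$$[\Psi_{\mathcal{M}_k}(w)]^{-1} = \overline{\big(\Psi_{\mathcal{M}_k,o}(w)\big)^{(rev)}}.$$
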

The above Theorem 3 provides a simpler method to compute the inverse
of a Parikh matrix. Here we have to reverse a matrix that is of a fixed size
($card(\Sigma) + 1$), whereas in the case of Theorem 2 we have to reverse the word $w$
that can be arbitrarily long.

From Theorems 2 and 3 we deduce: 

Corollary 3. Let $\Sigma = \{a_1 < a_2 < \cdots < a_k\}$ be an ordered alphabet and let
$w \in \Sigma^*$ be a word. Then:

$$\Psi_{\mathcal{M}_k}(mi(w)) = \big(\Psi_{\mathcal{M}_k,o}(w)\big)^{(rev)}.$$
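The reverse of a triangle matrix is easy to compute, and its interplay with the dual order can be explored numerically. The following Python sketch uses hypothetical names; the dual order is obtained simply by listing the alphabet in the opposite order. It checks, on small inputs only, that reversing the Parikh matrix of $w$ taken with respect to the dual order yields the Parikh matrix of $mi(w)$.

```python
def parikh_matrix(alphabet, w):
    """Parikh matrix of w over the ordered alphabet (letters in increasing order)."""
    n = len(alphabet) + 1
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    for ch in w:
        q = alphabet.index(ch)
        for i in range(n):          # right-multiply by the matrix of the letter
            M[i][q + 1] += M[i][q]
    return M

def reverse(M):
    """Reverse of a triangle matrix: m'_{i,j} = m_{n+1-j, n+1-i}."""
    n = len(M)
    return [[M[n - 1 - j][n - 1 - i] for j in range(n)] for i in range(n)]

def mirror_matrix_via_dual(alphabet, w):
    """Reverse of the Parikh matrix of w with respect to the dual order."""
    return reverse(parikh_matrix(alphabet[::-1], w))
```

For $w = cbbaa$ over $\{a < b < c\}$ the result coincides with the Parikh matrix of $aabbc$.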

The subsequent final observation concerning the functions introduced is rather
obvious. Consider the following four functions from $\mathcal{M}_k$ to $\mathcal{M}_k$: the identity $I$, the
mapping $\overline{\phantom{A}}$ of $A$ to $\overline{A}$, the mapping $(rev)$ of $A$ to $A^{(rev)}$ and the mapping $\overline{(rev)}$
of $A$ to $\overline{A^{(rev)}}$. Then these four functions together with the operation of
composition constitute a group and, moreover, this is the well-known Four-Group of
Klein.

Comment. Note that the above two methods to compute the inverse of a
Parikh matrix do not work in the case of the extended Parikh matrix.

5 Some Algebraic Properties

The next theorem is a direct consequence of the definition of Parikh matrix.



Theorem 4. The entries $m_{i,j+1}$, $1 \le i \le j \le k$, in a Parikh matrix
satisfy the inequality

+ l ^ j-l-1 . 

Concerning the minors of a Parikh matrix we can prove that: 



Theorem 5. The value of each minor of an arbitrary Parikh matrix is a non- 
negative integer. 



Comment The above result is true also for extended Parikh matrices. 

The following general inequality is a consequence of (extended) Parikh ma- 
trices. 



Theorem 6. The inequality $|w|_{xyz} |w|_y \le |w|_{xy} |w|_{yz}$ holds for arbitrary words
$w, x, y, z$.

A generalization of the above inequality is: 

Theorem 7. Consider an integer t and let w,x,yi,y 2 , ■ ■ - yt, z be arbitrary words. 
Then 

|w|yi . . . \w\y^ \w\xy.i^...ytz Cl I'it'Uyi I'it’lyiya ■ ■ ■ 

The next result concerns equality between sums of terms of the form $|w|_x$.

Theorem 8. The equality

$$|w|_{x_1} + |w|_{x_2} + \cdots + |w|_{x_n} = |w|_{y_1} + |w|_{y_2} + \cdots + |w|_{y_m}$$

is decidable for all words $w, x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_m$.

Open Problem. The decidability of the inequality

$$|w|_{x_1} + |w|_{x_2} + \cdots + |w|_{x_n} \le |w|_{y_1} + |w|_{y_2} + \cdots + |w|_{y_m},$$

where $w, x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_m$ are arbitrary words, is not known.

6 Algebraic Structures and Other Operations

Let $k$ be a positive integer and denote by $\Psi\mathcal{M}_k$ the set of all Parikh matrices of
dimension $k$. The set of all Parikh matrices is denoted by $\Psi\mathcal{M}$.

We define a special type of sum between Parikh matrices, denoted by $\oplus$.

If $A$ and $B$ are Parikh matrices then the sum $A \oplus B = C$ where $C$ is obtained
as the usual sum of matrices except that all elements on the main diagonal of $C$
have by definition value 1.

Theorem 9. Let $\Sigma$ be an alphabet with $card(\Sigma) = k \le 2$. Then, if $A$ and $B$
are Parikh matrices from $\Psi\mathcal{M}_k$, then $A \oplus B$ is also a Parikh matrix.

Proof. For $k = 1$ the result is trivial. Assume now that $k = 2$ and let $x$
be a preimage of $A$ and $y$ a preimage of $B$. Assume that $pq = c_{1,3}$. We define
$z = b^t a^p b^q a^r$, where $t + q = |x|_b + |y|_b$ and $p + r = |x|_a + |y|_a$ and $pq = c_{1,3}$. One
can verify that $z$ is a preimage of $C$.

As a consequence we obtain: 



Theorem 10. Both (TM, •, 0, 0, Ji) and (TM, •, 0, 0, /i) are semirings. 



Comment. The above results are not true if $card(\Sigma) \ge 3$. For instance, if
$\Sigma = \{a, b, c\}$ and we consider $x = abc$ and $y = b$, then the partial sum of the
corresponding Parikh matrices is not a Parikh matrix.

Open Problem. For the time being we don’t know under what conditions
the partial sum of two Parikh matrices continues to be a Parikh matrix.

The last part of this section is dedicated to closure properties of $\Psi\mathcal{M}_k$ and of
$\Psi\mathcal{M}$ under certain operations. We start by recalling the operation of shuffle on
trajectories. This notion was defined in [9].

Consider the alphabet $V = \{r, u\}$. We say that $r$ and $u$ are versors in the
plane: $r$ stands for the right direction, whereas $u$ stands for the up direction.

Definition 4. A trajectory is an element $t \in V^*$.

We will consider also sets $T$ of trajectories, $T \subseteq V^*$.

Let $\Sigma$ be an alphabet and let $t$ be a (finite) trajectory, let $d$ be a versor,
$d \in V$, let $\alpha, \beta$ be two (finite) words over $\Sigma$.

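Definition 5. The shuffle of $\alpha$ with $\beta$ on the trajectory $dt$, denoted $\alpha \sqcup\!\sqcup_{dt}\, \beta$, is
recursively defined as follows:

if $\alpha = ax$ and $\beta = by$, where $a, b \in \Sigma$ and $x, y \in \Sigma^*$, then:

$$ax \sqcup\!\sqcup_{dt}\, by = \begin{cases} a(x \sqcup\!\sqcup_t\, by), & \text{if } d = r, \\ b(ax \sqcup\!\sqcup_t\, y), & \text{if } d = u. \end{cases}$$

if $\alpha = ax$ and $\beta = \lambda$, where $a \in \Sigma$ and $x \in \Sigma^*$, then:

$$ax \sqcup\!\sqcup_{dt}\, \lambda = \begin{cases} a(x \sqcup\!\sqcup_t\, \lambda), & \text{if } d = r, \\ \emptyset, & \text{if } d = u. \end{cases}$$

if $\alpha = \lambda$ and $\beta = by$, where $b \in \Sigma$ and $y \in \Sigma^*$, then:

$$\lambda \sqcup\!\sqcup_{dt}\, by = \begin{cases} \emptyset, & \text{if } d = r, \\ b(\lambda \sqcup\!\sqcup_t\, y), & \text{if } d = u. \end{cases}$$

Finally,

$$\lambda \sqcup\!\sqcup_t\, \lambda = \begin{cases} \lambda, & \text{if } t = \lambda, \\ \emptyset, & \text{otherwise.} \end{cases}$$

Comment. Note that if $|\alpha| \ne |t|_r$ or $|\beta| \ne |t|_u$, then $\alpha \sqcup\!\sqcup_t\, \beta = \emptyset$.

If $T$ is a set of trajectories, the shuffle of $\alpha$ with $\beta$ on the set $T$ of trajectories,
denoted $\alpha \sqcup\!\sqcup_T\, \beta$, is:

$$\alpha \sqcup\!\sqcup_T\, \beta = \bigcup_{t \in T} \alpha \sqcup\!\sqcup_t\, \beta.$$

The above operation is extended to languages over $\Sigma$: if $L_1, L_2 \subseteq \Sigma^*$, then:

$$L_1 \sqcup\!\sqcup_T\, L_2 = \bigcup_{\alpha \in L_1, \beta \in L_2} \alpha \sqcup\!\sqcup_T\, \beta.$$

Consider the following example. Let $\alpha$ and $\beta$ be the words $\alpha =
a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8$, $\beta = b_1 b_2 b_3 b_4 b_5$ and assume that $t = r^3 u^2 r^3 ururu$. The shuffle
of $\alpha$ with $\beta$ on the trajectory $t$ is:

$$\alpha \sqcup\!\sqcup_t\, \beta = \{a_1 a_2 a_3 b_1 b_2 a_4 a_5 a_6 b_3 a_7 b_4 a_8 b_5\}.$$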
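Definition 5 unfolds into a simple left-to-right scan of the trajectory. The Python sketch below is an illustration with hypothetical function names; words and trajectories are strings, and the result is a set that is either empty or contains a single word.

```python
def shuffle_on_trajectory(alpha, beta, t):
    """Shuffle of alpha with beta on the trajectory t over {r, u}."""
    out, i, j = [], 0, 0
    for d in t:
        if d == "r":                  # take the next letter of alpha
            if i == len(alpha):
                return set()
            out.append(alpha[i])
            i += 1
        elif d == "u":                # take the next letter of beta
            if j == len(beta):
                return set()
            out.append(beta[j])
            j += 1
        else:
            raise ValueError("a trajectory is a word over {r, u}")
    # Both words must be consumed exactly when the trajectory ends.
    if i < len(alpha) or j < len(beta):
        return set()
    return {"".join(out)}

def shuffle_on_trajectories(alpha, beta, T):
    """Shuffle of alpha with beta on a set T of trajectories."""
    result = set()
    for t in T:
        result |= shuffle_on_trajectory(alpha, beta, t)
    return result
```

Writing the letters $a_1, \ldots, a_8$ as `12345678` and $b_1, \ldots, b_5$ as `ABCDE`, the trajectory `rrruurrrururu` of the example produces the single word `123AB456C7D8E`.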



Comment. Note that the operation of shuffle on trajectories can be extended
to finite strings of matrices, to finite sequences of finite graphs, etc.

For instance, in the above example the letters $a_i$ and $b_i$ can be square matrices and
the catenation can be the multiplication of matrices.

Theorem 11. Both $\Psi\mathcal{M}_k$ and $\Psi\mathcal{M}$ are closed under the operation of shuffle on
trajectories.

Consequently we obtain that

Theorem 12. Both $\Psi\mathcal{M}_k$ and $\Psi\mathcal{M}$ are closed under the operations of multiplication,
shuffle, insertion, shuffle literal, bicatenation, etc.

7 Conclusion 

We presented the most important results and problems concerning Parikh ma- 
trices. A problem area we have not discussed at all in this paper concerns sets of 
Parikh matrices and families of such sets, analogous to the family of semilinear 
sets of Parikh vectors. 
Abstract. Let C be a device performing computations of a cryptographic
protocol. Assume C to have limited computing power, but to
have access to another device A with superior capacities. This setting
could occur, for instance, with a smart card C and a mobile phone A.
We consider the situation where C is supposed to calculate the basic
operation of elliptic curve cryptography: the scalar multiplication of a
point P on a curve. We investigate whether C’s performance could be
improved by means of distributed computation; that is, whether C could
exploit A’s computing power, without compromising the safety of the
procedure. We set up three models of computation, varying the demand
for C’s trust in A’s honesty.



1 Arithmetic on Elliptic Curves 

Denote by $\mathbb{F}_q$ the field with $q$ elements. If the field is binary, then the
(non-supersingular) elliptic curve $E(\mathbb{F}_q)$ over $\mathbb{F}_q$ is defined as the set of points

$$E(\mathbb{F}_q) = \{(x, y) \mid y^2 + xy = x^3 + ax^2 + b\} \cup \{\mathcal{O}\} \qquad (a, b \in \mathbb{F}_q,\ b \neq 0),$$

where $\mathcal{O}$ is the point at infinity. (In non-binary fields the term $ax^2$ must be
replaced by $ax$ and the condition $4a^3 + 27b^2 \neq 0$ should hold.) If the group
operation is suitably defined, then $E(\mathbb{F}_q)$ is an Abelian group with $\mathcal{O}$ as the
neutral element. It is customary to adopt additive notation, so we denote the
group operation by +. This addition can be performed using the basic arithmetic
operations of the base field. Below we give the equations for a binary field. Let
$P = (x, y) \neq \mathcal{O}$ be a point on the curve. Then

$$-P = (x,\ x + y),$$

$$2P = (u, v) = (\theta^2 + \theta + a,\ \theta(x + u) + u + y),$$

where $\theta = x + \frac{y}{x}$. Moreover, if also $Q = (x', y') \neq \mathcal{O}$, $Q \neq P$ and $Q \neq -P$
(i.e. $x \neq x'$), then

$$P + Q = (u', v') = (\theta'^2 + \theta' + a + x + x',\ \theta'(x + u') + u' + y),$$

where $\theta' = \frac{y + y'}{x + x'}$.

J. Karhumäki et al. (Eds.): Theory Is Forever (Salomaa Festschrift), LNCS 3113, pp. 181-191, 2004.
© Springer-Verlag Berlin Heidelberg 2004
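The binary-field formulas above can be tried out on a very small curve. The sketch below is only an illustration under stated assumptions: the field GF(2^4) with reduction polynomial x^4 + x + 1, the coefficients a = b = 1, and all function names are hypothetical choices made here, not parameters from the paper (real curves use fields of at least 160 bits). Field addition is XOR, and `None` stands for the point at infinity.

```python
M, MOD = 4, 0b10011      # toy field GF(2^4) = GF(2)[x]/(x^4 + x + 1) (assumption)
A, B = 1, 1              # hypothetical curve y^2 + xy = x^3 + A x^2 + B

def gf_mul(a, b):
    """Multiply field elements stored as bit vectors of polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:           # degree reached M: reduce modulo MOD
            a ^= MOD
    return r

def gf_inv(a):
    """Inverse of a non-zero element, computed as a^(2^M - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, a)
    return r

def on_curve(P):
    if P is None:            # None plays the role of the point at infinity
        return True
    x, y = P
    x2 = gf_mul(x, x)
    return (gf_mul(y, y) ^ gf_mul(x, y)) == (gf_mul(x2, x) ^ gf_mul(A, x2) ^ B)

def neg(P):
    return None if P is None else (P[0], P[0] ^ P[1])

def double(P):
    if P is None or P[0] == 0:       # a point with x = 0 is its own negative
        return None
    x, y = P
    theta = x ^ gf_mul(y, gf_inv(x))
    u = gf_mul(theta, theta) ^ theta ^ A
    return (u, gf_mul(theta, x ^ u) ^ u ^ y)

def add(P, Q):
    if P is None:
        return Q
    if Q is None:
        return P
    if P == Q:
        return double(P)
    if P[0] == Q[0]:                 # same x, different y: Q = -P
        return None
    (x, y), (x2, y2) = P, Q
    theta = gf_mul(y ^ y2, gf_inv(x ^ x2))
    u = gf_mul(theta, theta) ^ theta ^ A ^ x ^ x2
    return (u, gf_mul(theta, x ^ u) ^ u ^ y)

points = [None] + [(x, y) for x in range(2**M) for y in range(2**M) if on_curve((x, y))]
print(len(points), "points on the toy curve, including the point at infinity")
```

The sketch handles the special cases that the formulas exclude (adding the point at infinity, adding a point to its negative, doubling a point with x = 0) explicitly, since the formulas divide by x or by x + x'.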

Elliptic curve cryptosystems are public key cryptosystems [Sal]. In particular
they are ElGamal based systems [EIG], hence their security is based on the
difficulty of the discrete logarithm problem in the group $(E(\mathbb{F}_q), +)$. More
specifically, EC based systems exploit the fact that given points P and kP (for some
unknown integer k) on a curve $E(\mathbb{F}_q)$, it is practically impossible to calculate k.
The operation of computing kP = P + ... + P is called scalar multiplication of
the point P.

The users of an elliptic curve cryptosystem all share the same domain
parameters (m, f(x), a, b, G, r). Of these, m and f(x) fix the underlying field and
the representation of field elements, a and b define the curve, and G is a point on
the curve of order r. The private key of a user is then simply an integer s < r,
and the corresponding public key is the point Q = sG.

For an extensive study on elliptic curves in cryptography we refer to [BSS].
Here, as an example, we describe the ECDSA signature algorithm, which is
included in all ECC standards, see for example [P1363,X9.62]. Let s be A’s
private key and Q the public key. A signature for a message m is then computed
as follows.

Signature generation:

1. A computes a representative h(m) of the message, where h is a hash function
agreed beforehand (SHA-1, for example).

2. A selects a random integer k, computes the point P = kG and converts the
x-coordinate of it to an integer c.

3. A’s signature for the message m is the pair (c, d), where $d = k^{-1}(h(m) + sc) \pmod{r}$.

A verifier B can check the validity of a signature (c, d) for a message m using
A’s public key Q = sG:

Signature verification:

1. B computes the hash value h(m).

2. B computes the point $R = d^{-1}(h(m)G + cQ)$ and converts the x-coordinate
to an integer c'.

3. If c = c' then B accepts the signature.

It is an easy task to verify that if both A and B follow their algorithms then
P = R, and thus also c = c':

$$R = d^{-1}(h(m)G + cQ) = d^{-1}(h(m) + sc)G = kG = P.$$
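The identity R = P can also be checked numerically. The sketch below follows the signing and verification steps exactly as described above, but on a deliberately tiny curve over a prime field rather than a binary field: y^2 = x^3 + 2x + 2 over F_17 with base point G = (5, 1) of order r = 19. That curve, the function names, and the use of a plain integer h in place of a real hash value are all assumptions made for illustration; such parameters offer no security.

```python
P_MOD, A = 17, 2                 # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumption)
G, R_ORD = (5, 1), 19            # base point G and its prime order r

def add(P, Q):
    """Group law on the toy curve; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # Q = -P
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, P):
    """Scalar multiplication kP by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def sign(h, s, k):
    """Signature (c, d) for hash representative h, private key s, nonce 0 < k < r."""
    c = mul(k, G)[0]                                  # x-coordinate of P = kG
    d = pow(k, -1, R_ORD) * (h + s * c) % R_ORD
    return (c, d) if d else None                      # d = 0 is unusable: pick a new k

def verify(h, Q, sig):
    c, d = sig
    d_inv = pow(d, -1, R_ORD)
    # R = d^-1 (hG + cQ), computed as (d^-1 h)G + (d^-1 c)Q
    R = add(mul(d_inv * h % R_ORD, G), mul(d_inv * c % R_ORD, Q))
    return R is not None and R[0] == c

s = 7                            # private key
Q = mul(s, G)                    # public key
print(verify(11, Q, sign(11, s, 5)))
```

The modular inverses use Python's three-argument `pow` with exponent -1, which requires Python 3.8 or later.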

In ECDSA (and in fact in any cryptographic primitive employing elliptic
curves), scalar multiplication is the most time consuming operation. This
motivates us to concentrate on how to securely and efficiently distribute the
computation of kP.

2 Scalar Multiplication of a Point

In the following we introduce our favourite algorithm for scalar multiplication:
the Fixed Base Windowing algorithm (FBW for short), see [MOV]. Slightly
modified, we find it the most suitable for efficient and secure calculation of kP. FBW
seems recommendable even in the cases where the base P is not fixed. The FBW
algorithm is a constant time algorithm where all point doublings occur in the
precomputation, and all point additions during the actual computation. This
feature diminishes the chance of a successful side-channel attack. These are attacks
against specific implementations of some algorithm, which collect information
by measuring e.g. time or power consumption of the device when it is processing
some secret data. For more information on such attacks, see [Koc,KJJ].



2.1 The FBW Algorithm

The Fixed Base Windowing algorithm is a general exponentiation algorithm, as
it can be applied in any group. As the name suggests, it is specially designed for
situations where the base is fixed. A significant part of the computation consists
of constructing a precomputation table T, the contents of which depend only
on the base. If we always have the same base, then it suffices to calculate T just
once and store it for further use. But, if necessary, we can also compute T each
time separately.

Below we present the basic version of FBW. We adopt the notation from the
elliptic curve setup.

Let P = (x, y) be a point of order r on some curve $E(\mathbb{F}_q)$, and let k be a
positive integer less than r. Our task is to compute the point kP. The idea of the
algorithm is to look at the multiplier k through a “window” of fixed width w. For
notational convenience we assume that the length of k's binary representation
is n = lw, and we denote $k = \sum_{i=0}^{l-1} k_i 2^{wi}$, where $0 \le k_i < 2^w$. Denote also

$$P_i = 2^{wi}P \quad (i = 0, 1, \ldots, l - 1); \qquad \text{hence } kP = \sum_{i=0}^{l-1} k_i P_i.$$

Algorithm 1

Q := P; R := O; B := O;            (* O = point at infinity *)
For i := 0 To l-1 Do               (* precomputation *)
    T[i] := Q;
    For j := 1 To w Do Q := 2*Q;
For i := 2^w-1 Downto 1 Do         (* computation depending on k *)
    For j := 0 To l-1 Do
        If k[j] = i Then B := B + T[j];
    R := R + B;
Return (R)

The first part of the algorithm generates the precomputation table $T[i] = P_i$.
Then the product kP is accumulated to R such that each table element T[i] is
included in B when there are $k_i$ additions R := R + B left. This sums up to
$R = \sum_i k_i P_i = kP$. The first part of the algorithm consists of doublings only,
whereas there are just additions in the second part. As already mentioned, this is
a strength (compared to, for instance, the traditional double-and-add algorithm)
against side-channel attacks.
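Since FBW works in any group, Algorithm 1 can be sketched generically. The version below takes the group operation and the neutral element as parameters; the function name `fbw`, its signature, and the stand-in group used in the demonstration (integers modulo 1009 under addition) are assumptions made for illustration only.

```python
def fbw(k, P, w, l, add, zero):
    """Fixed Base Windowing: compute kP for 0 <= k < 2^(l*w).

    add(X, Y) is the group operation and zero its neutral element.
    """
    # Precomputation: T[i] = 2^(w*i) P, obtained by doublings only.
    T, Q = [], P
    for _ in range(l):
        T.append(Q)
        for _ in range(w):
            Q = add(Q, Q)
    # Window digits k_j, so that k = sum of k_j * 2^(w*j).
    digits = [(k >> (w * j)) & (2**w - 1) for j in range(l)]
    # Computation depending on k: additions only.
    R, B = zero, zero
    for i in range(2**w - 1, 0, -1):
        for j in range(l):
            if digits[j] == i:
                B = add(B, T[j])     # T[j] joins B when k_j additions R += B remain
        R = add(R, B)
    return R


r = 1009                             # toy group (assumption): integers mod r under addition
add = lambda a, b: (a + b) % r
print(fbw(777, 5, w=4, l=3, add=add, zero=0), (777 * 5) % r)
```

Plugging in an elliptic curve addition routine for `add` and the point at infinity for `zero` gives the setting of the paper; the control flow stays the same.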

Let us consider our model of distributed computing: when computing the
product kP (with FBW), C takes advantage of A’s computing power. We
recommend that C use A’s assistance with the precomputation only, and that the actual
computation be performed solely by C. There are at least two good reasons
for this. First, if A is used in computations which depend on the coefficient k, 
then C has to ask A for something specific. That would bring on a need for du- 
plex data transfer. Second, it seems impossible to build a scheme where C would 
profit from A’s computation without having to reveal any relevant information 
on k. 



3 Models of Computation 

The main objective of this article is to analyze the benefits of distributed com- 
putation when using elliptic curve cryptography. As scalar multiplication is the 
basic operation in EC cryptography, and most of the time in any cryptographic 
EC operation is spent computing kP, it seems natural to concentrate on it. 

Let C be a device to calculate kP. Assume that k is known to C only, whereas 
P is public. Suppose that A is another device with superior computing power, 
and that A is willing to help C in the computations. For example, C could be a 
smart card inserted in a mobile phone A. 

We consider three different models. 

I   A does not participate in the computation at all, so C does everything
    independently. Thus, no data needs to be transferred between A and C.

II  A is used during the computation, but all data that A provides is checked
    by C. No information on k is given to A, so C doesn’t need to trust A.

III A participates in the computations, and C trusts the correctness of all data
    A gives to C. If A really is honest, then the data given by C doesn’t reveal
    to A any information on k.

Model I is included to give a basis for relevant comparisons of the benefits
of the other two models. In I and II the result of C is never incorrect, but in
model III this is not necessarily the case if A cheats. It should also be noted
that we really cannot make any weaker assumptions on A than we do in model
III. The same information that an honest A may learn is available also to an
eavesdropper listening to the traffic between A and C. Moreover, if A learns k, it
could as well do all the calculations by itself, making C useless.

In the following we assume that kP is computed using the Fixed Base
Windowing algorithm, and that A is used only in the precomputation (if at all). We
also assume that |k| = n = lw. These choices are motivated by the following
reasons:

– Based on our tests, a modification of FBW seems a recommendable choice
for calculating kP, as it combines efficiency with suitability to avoid
side-channel attacks. Information on different exponentiation algorithms can be
found in [Gor,MOV].

– If A helps C during the precomputation, then A can send the required data
without C’s request. On the other hand, if A helps C in calculations
depending on k, then first C needs to request some specific data from A. As a
consequence more data needs to be transmitted between the parties. In the
smart-card environment data transmission is relatively slow.

– If C’s help request depends on k, there is always a risk that some
information on k gets revealed. In our experience it seems difficult to use A during
computations that depend on k.

We represent the complexity of an algorithm as a triple (s, m, i), where s, m and 
i stand for the number of squarings, multiplications and inversions of the field 
elements, respectively. For example, the complexity of both point doubling and 
ordinary point addition on elliptic curves is (1,2,1). (We ignore the costs due 
to adding field elements, as that is performed by XORing the summands and is 
extremely fast.) 

3.1 Model I 

This is the traditional model, where C needs to do all computations by itself.
Looking at algorithm 1 it is easy to count the complexity. In the precomputation
we need n doublings, so the complexity is (n, 2n, n) = (lw, 2lw, lw). In the actual
computation we need $c_k = l + 2^w - 1$ additions. Thus, the total complexity is

$$(lw + c_k,\ 2(lw + c_k),\ lw + c_k).$$

3.2 Model II 

Let us now discuss how C could utilize A’s computing power during the pre- 
computation. It seems reasonable to reduce the problem to performing a single 
doubling, since computing the table points consists of consecutive doublings. So, 
let us look at the situation where A and C need to calculate the point Q = 2P.
We denote P = (x, y) and Q = (u, v).

As A is not trusted, C must be able to check that every piece of information 
given by A is correct. Therefore, a precondition for the success of this model is 
to find operations such that performing them is more costly than verifying the 
results. 

In [Knu] it was shown that, under certain conditions, point halving costs less
than point doubling. Thus, one possible idea is that A simply gives C the point
Q = 2P, and C verifies this by checking that $\frac{1}{2}Q = P$. However, this approach
has some problems. An efficient point halving algorithm requires a relatively large
amount of memory, which is not acceptable in our main application where C is a smart card.
More importantly, if C does have the necessary resources for efficient point
halving, they would be best exploited by using a so-called “halve-and-add” algorithm
(see [Knu]).

In the following we present two ideas, both of which are based on a simple
observation: the result of a field inversion is easily verified by performing one field
multiplication. Probably in any implementation of a binary field, multiplication
costs noticeably less than an inversion. Further, let us point out that using the
doubling formula involves computing the value $\theta = x + \frac{y}{x}$. These facts together
motivate us to suggest a scheme, where A after computing the value of $\theta$ hands
it over to C, who first checks its validity and then makes use of it in computing
the coordinates of Q.

Algorithm 2

A → C    θ = x + y/x
C        x(θ + x) = y ?              (* checking θ *)
C        u = θ^2 + θ + a
C        v = θ(x + u) + u + y

Computing the complexity of Algorithm 2 is straightforward. C needs one
multiplication to verify that A has given the correct $\theta$; computing u and v
requires one squaring and one multiplication, respectively. These sum up to (1, 2, 0)
for one doubling and (lw, 2lw, 0) for all precomputation. Altogether, we save time
equal to that needed for lw inversions.
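The division of labour in Algorithm 2 can be sketched as two functions, one for each device. As before, this is an illustration under stated assumptions: the toy field GF(2^4), the curve coefficients a = b = 1, and the function names `helper_theta` and `checked_double` are hypothetical and not taken from the paper.

```python
M, MOD = 4, 0b10011      # toy field GF(2^4) = GF(2)[x]/(x^4 + x + 1) (assumption)
A, B = 1, 1              # hypothetical curve y^2 + xy = x^3 + A x^2 + B

def gf_mul(a, b):
    """Multiply field elements stored as bit vectors of polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r

def gf_inv(a):
    """Inverse of a non-zero element, computed as a^(2^M - 2)."""
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, a)
    return r

def on_curve(x, y):
    x2 = gf_mul(x, x)
    return (gf_mul(y, y) ^ gf_mul(x, y)) == (gf_mul(x2, x) ^ gf_mul(A, x2) ^ B)

def helper_theta(x, y):
    """Device A: the expensive step, theta = x + y/x (one field inversion)."""
    return x ^ gf_mul(y, gf_inv(x))

def checked_double(x, y, theta):
    """Device C: check theta with one multiplication, then finish 2P = (u, v)."""
    if gf_mul(x, theta ^ x) != y:
        raise ValueError("helper returned an incorrect theta")
    u = gf_mul(theta, theta) ^ theta ^ A          # one squaring
    v = gf_mul(theta, x ^ u) ^ u ^ y              # one multiplication
    return u, v

# Pick any affine point with x != 0 on the toy curve and double it.
x, y = next((x, y) for x in range(1, 2**M) for y in range(2**M) if on_curve(x, y))
theta = helper_theta(x, y)
u, v = checked_double(x, y, theta)
print((x, y), "->", (u, v), on_curve(u, v))
```

C never calls `gf_inv`; a wrong value of theta changes the product x(theta + x) and is therefore always detected.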

Our second algorithm for model II is based on the idea of representing a
point as a pair $(x, \theta)$ instead of $(x, y)$ (where $\theta = x + \frac{y}{x}$). If $P = (x, y) \sim (x, \theta)$
and $Q = 2P = (u, v) \sim (u, \theta')$ then, by the doubling formula:

$$\begin{aligned}
u &= \theta^2 + \theta + a, \\
\theta' &= u + vu^{-1} = u + (\theta(x + u) + u + y) \cdot u^{-1} \\
&= u + (\theta(x + u) + u + x(\theta + x)) \cdot u^{-1} \\
&= u + (u(\theta + 1) + x^2) \cdot u^{-1} \\
&= \theta + u + 1 + x^2u^{-1}.
\end{aligned}$$

The complexity of one doubling using this method is (1, 2, 1). Thus, there would
be no time savings if C doubled points this way. But if C receives the value
$\theta'$ from A, then C only has to check its validity by making sure that
$x^2 = u(\theta' + \theta + u + 1)$. The cost of the validation operation is (1, 1, 0), and computing
u requires one additional squaring.

Algorithm 3

C        u = θ^2 + θ + a
A → C    θ' = u + v·u^{-1}
C        x^2 = u(θ' + θ + u + 1) ?

In the computation of kP we need the actual y-coordinate of every wth table
element. Since $y = x(\theta + x)$, this costs an additional multiplication for each of
the l table elements. Altogether, the complexity of precomputation will then be
(2lw, lw + l, 0). We notice that algorithm 3 is more efficient than algorithm 2 if
w squarings in the base field can be computed faster than w − 1 multiplications.
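Algorithm 3 can be sketched in the same way, with points carried as pairs (x, θ). The toy field GF(2^4), the coefficients a = b = 1 and all function names are again assumptions made for illustration. The helper needs u to be non-zero, so the demonstration only uses starting points for which that holds.

```python
M, MOD = 4, 0b10011      # toy field GF(2^4) = GF(2)[x]/(x^4 + x + 1) (assumption)
A, B = 1, 1              # hypothetical curve y^2 + xy = x^3 + A x^2 + B

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r

def gf_inv(a):
    r = 1
    for _ in range(2**M - 2):
        r = gf_mul(r, a)
    return r

def on_curve(x, y):
    x2 = gf_mul(x, x)
    return (gf_mul(y, y) ^ gf_mul(x, y)) == (gf_mul(x2, x) ^ gf_mul(A, x2) ^ B)

def theta_of(x, y):
    return x ^ gf_mul(y, gf_inv(x))

def u_of(theta):
    return gf_mul(theta, theta) ^ theta ^ A

def helper_next_theta(x, theta):
    """Device A: theta' = theta + u + 1 + x^2 / u (one field inversion)."""
    u = u_of(theta)
    return theta ^ u ^ 1 ^ gf_mul(gf_mul(x, x), gf_inv(u))

def checked_double(x, theta, theta_next):
    """Device C: double (x, theta), accepting theta' only if the check passes."""
    u = u_of(theta)                                            # one squaring
    if gf_mul(x, x) != gf_mul(u, theta_next ^ theta ^ u ^ 1):  # squaring + multiplication
        raise ValueError("helper returned an incorrect theta'")
    return u, theta_next

def y_coordinate(x, theta):
    """Recover y = x(theta + x) when the affine point is actually needed."""
    return gf_mul(x, theta ^ x)

candidates = [(x, y) for x in range(1, 2**M) for y in range(2**M)
              if on_curve(x, y) and u_of(theta_of(x, y)) != 0]
x, y = candidates[0]
theta = theta_of(x, y)
u, theta2 = checked_double(x, theta, helper_next_theta(x, theta))
print((x, y), "->", (u, y_coordinate(u, theta2)))
```

The y-coordinate is recovered only at the end, which mirrors the remark above that the extra multiplication is needed once per table element rather than once per doubling.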

3.3 Model III

Analyzing this model is trivial. As A is trusted, A can simply give the whole
precomputation table to C, who only needs to do the computations depending on
k. Thus C’s total complexity in this case is $(c_k, 2c_k, c_k)$.

It should be noted, however, that this model is quite dangerous. In the fol- 
lowing section we consider the risks. 



4 What if A Cheats? 

In this section we consider what damage can be done if C trusts a cheating A in 
our model III above. The obvious consequence is that C’s calculations are then 
incorrect, which might be very harmful as such. But the real danger is that the 
false result might reveal some secret information. 

Assume again that C is trying to compute kP, where k is secret and P is
public. Suppose that A gives C a precomputation table T, and C (trusting A)
computes the point

$$X_T = \sum_{i=0}^{l-1} k_i T[i]$$

using the FBW algorithm. The point $X_T$ is again public information, so also A
learns it. From A’s point of view k is a uniformly distributed random
variable. Theoretically, the information that $X_T$ reveals about k is
$I(X_T, k) = H(X_T) - H(X_T \mid k)$, where H(Y) is the entropy of a random variable Y:
$H(Y) = -\sum_y p(y) \log_2 p(y)$. In our case $X_T$ is completely determined by k,
thus $H(X_T \mid k) = 0$ and $I(X_T, k) = H(X_T)$.

For simplicity, suppose that k is an integer selected uniformly from all integers
of length n = lw: $0 \le k < 2^n$.

Suppose that A has been honest, so the table T is correct. Then $X_T = kP$ and
it is easy to verify that $H(X_T) = n$. In other words, $X_T$ completely determines k
(as it should). The problem is that to extract the available information A needs
to compute the discrete logarithm.

On the other hand, assume that A has given C a table consisting of only
one non-zero point T[i]. Then $X_T = k_iT[i]$ and $H(X_T) = w$. This means that
$X_T$ reveals w bits of information about k. Again, to extract this information A
needs to compute the discrete logarithm. But in this case it is trivial, as there
are only $2^w$ alternatives and typically w is very small.
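The single-point attack is easy to simulate. In the sketch below C runs the k-dependent part of FBW on whatever table it receives, and A then recovers the window digit by trying all 2^w candidates. The stand-in group (integers modulo 1009 under addition, where discrete logarithms are trivial anyway), the secret value and the function names are assumptions chosen only to make the example self-contained; the exhaustive search itself uses nothing but the group operation.

```python
def card_computation(k, T, w, add, zero):
    """C's part of FBW, run on the table T it was handed."""
    digits = [(k >> (w * j)) & (2**w - 1) for j in range(len(T))]
    R, B = zero, zero
    for i in range(2**w - 1, 0, -1):
        for j, d in enumerate(digits):
            if d == i:
                B = add(B, T[j])
        R = add(R, B)
    return R

def recover_digit(X, point, w, add, zero):
    """A's exhaustive search: the c < 2^w with c * point == X, if any."""
    candidate = zero
    for c in range(2**w):
        if candidate == X:
            return c
        candidate = add(candidate, point)
    return None


r = 1009                          # toy group (assumption): integers mod r under addition
add = lambda a, b: (a + b) % r
w = 4
k = 0x9E3                         # secret with window digits k_0 = 3, k_1 = 14, k_2 = 9
T = [0, 5, 0]                     # dishonest table: only T[1] is non-zero
X = card_computation(k, T, w, add, 0)
print(recover_digit(X, T[1], w, add, 0))   # 14, i.e. the digit k_1
```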

There is a natural generalization of this attack. A can select T to consist
of a “small” number of non-zero points. For example, suppose that
$T[i_0 + i] = 2^{wi} \cdot T[i_0]$ for some $i_0$ and $i = 0, \ldots, j - 1$, and the rest $T[i] = \mathcal{O}$.
Then $H(X_T) = jw$ and, if j is small enough, A can learn the secret bits
$k_{i_0}, \ldots, k_{i_0+j-1}$. The optimal choice for j depends on A’s computational resources.

The attacks described above can be avoided if C checks that the table
elements T[i] are all non-zero. Unfortunately this is of little help, as we can modify
the attack in so many ways that C cannot possibly rule out all alternatives. For
example, A can choose all the table points as small multiples of some point R.
If T[i] = iR, then $X_T = (\sum_i i k_i)R$. It can be computed that $H(X_T) \approx 11.5$ if
n = 160 and w = 4.
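The entropy figures in this section can be recomputed by building the exact distribution of the scalar that multiplies R. The sketch below is one way to do this, under two assumptions that are not spelled out in the text: the order of R is larger than the largest possible sum, so that distinct sums give distinct points, and the table multipliers are taken to be 1, ..., l (so that no table entry is zero). The function name `entropy_of_xt` is hypothetical.

```python
from math import log2

def entropy_of_xt(multipliers, w):
    """Entropy in bits of X_T = (sum_i c_i * k_i) R.

    The k_i are independent and uniform on [0, 2^w); the c_i are the
    multipliers in T[i] = c_i R. Assumes the order of R exceeds the largest
    possible sum, so that distinct sums correspond to distinct points.
    """
    base = 2**w
    dist = [1.0]                        # dist[s] = probability that the sum equals s
    for c in multipliers:
        new = [0.0] * (len(dist) + (base - 1) * c)
        for s, p in enumerate(dist):
            if p:
                for d in range(base):
                    new[s + d * c] += p / base
        dist = new
    return -sum(p * log2(p) for p in dist if p > 0)


print(entropy_of_xt([1, 4, 16], 2))     # honest table with w = 2, l = 3: n = 6 bits
print(entropy_of_xt([1, 0, 0], 2))      # one non-zero table point: w = 2 bits
h = entropy_of_xt(range(1, 41), 4)      # small multiples of R, n = 160, w = 4
print(round(h, 2))
```

The first two calls reproduce the cases H = n and H = w on a small scale; the last one should land close to the value quoted above, which shows how little of the 160-bit secret such a table leaves protected.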

From the above it follows that, unless C is absolutely certain that A is honest, 
C should avoid our model III. 

Let us finally consider a setup somewhere between models II and III. Suppose
that, as in algorithm 2, A gives C the values $\theta$, but C does not verify their validity
(or alternatively verifies only some of them at random). But not even this idea
seems too promising. If A manages to give C two incorrect $\theta$’s in succession,
then A has a good chance to force the table elements T[i] to any point $(\alpha, \beta)$
it desires. For example, A could make T repeat itself by forcing T[l/2] = T[0],
whence T[l/2 + 1] = T[1] and so on. In this case it is easy to compute that
$I(X_T, k) \le \frac{n}{2} + w$. In other words, the security provided by k is roughly halved.

Let us describe how A can cheat. Denote by $\varphi(P, \theta)$ the point computed by
algorithm 2 without the verification of $\theta$. Hence, if $\theta$ is correct, $\varphi(P, \theta) = 2P$.
Given a starting point $P_0 = (x, y)$ and a target point $P_2 = (\alpha, \beta)$, A’s task is
to find such values z and t that $\varphi(P_1, t) = P_2$, where $P_1 = (u, v) = \varphi(P_0, z)$.
Necessarily t should satisfy the equation $t^2 + t + a = \alpha$. Such a t is easily
computed provided that $\mathrm{Tr}(a + \alpha) = 0$. It remains to select z such that the
resulting y-coordinate will be $\beta$. Applying the doubling formulae we obtain

$$\begin{aligned}
\beta &= t(u + \alpha) + \alpha + v \\
&= t(z^2 + z + a + \alpha) + \alpha + z(x + z^2 + z + a) + z^2 + z + a + y \\
&= z^3 + tz^2 + (t + x + a + 1)z + (t(\alpha + a) + \alpha + y + a).
\end{aligned}$$

Thus, if A can solve z from the equation above, A’s chances for a successful 
cheating are good. A’s task is to solve a polynomial equation of third degree in 
a binary field. The number of solutions is always either 0, 1 or 3. The following 
lemma from [BRS] tells us that the equation has at least one solution with 
probability greater than 50%. In the same article an efficient method to calculate 
the solution is given. 

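Lemma. The equation $z^3 + c_1z^2 + c_2z + c_3 = 0$ ($c_i \in \mathbb{F}_{2^m}$) has exactly one
solution in $\mathbb{F}_{2^m}$ if and only if $\mathrm{Tr}\left(\frac{(c_1^2 + c_2)^3}{(c_3 + c_1c_2)^2}\right) \neq \mathrm{Tr}(1)$.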

5 An Example 

To get a better understanding of the benefits of distributed computation we
consider a concrete example. Currently a 1024 bit RSA modulus is considered
to provide adequate security for most applications. The same level of security
using elliptic curve cryptography is achieved if the order of the group $E(\mathbb{F}_q)$ is
roughly $2^{160}$. Then the length of k’s binary representation is (approximately)
160 bits, and, from the point of view of FBW’s efficiency, the optimal size for
the window (w) is 3-5.

Thus, assume that C is calculating a point kP on a curve over a field of size
$2^{160}$, and that an untrusted A helps C (model II). Suppose that C uses the FBW
algorithm with w = 4 and l = 40. To measure the possible time savings we need
to know the relative costs of the basic operations: squaring, multiplication and
inversion in the underlying field. Naturally these figures depend heavily on the
available resources. We consider two different environments.

If C is a personal computer, then (according to our own implementation) one
multiplication takes roughly twice the time required for one squaring, and an
inversion costs approximately four multiplications. Then, in model I, the fastest
way for C to compute the precomputation table for P is to double P repeatedly
using the formula given in the first section. The complexity of it is (1, 2, 1), thus
the relative cost for the whole precomputation is 160 + 2 · 320 + 8 · 160 = 2080.

In model II we shall use algorithm 3, because squaring is cheaper than
multiplication. The complexity of one doubling is then (2, 1, 0), and one additional
multiplication is needed for each of the l table points. The cost for precomputation is
then 320 + 2 · (160 + 40) = 720. This means that the time required for
precomputation is cut to roughly one third.

To obtain the total saving we should take into account also the computation
depending on k. This is performed similarly in both models, requiring 55 point
additions, and the cost of it is 55 + 2 · 110 + 8 · 55 = 715. Therefore the calculation
time for kP is approximately halved in model II.

If C is a smart card with hardwired field multiplication, then squaring and
multiplication (which in this case are performed similarly) are really fast
compared with inversion. In the following we assume that one inversion takes the
same time as 40 multiplications. In this case it pays to use so-called projective
coordinates instead of the affine coordinates (which we have used so far) to
represent points. This follows, since if computations are performed using
projective coordinates, then we can almost completely avoid inversions (at the cost
of some additional squarings and multiplications). To understand what
projective coordinates are we refer to [BSS]; here it suffices to present a table for the
complexities of point doubling and point addition using different coordinates.
(In the mixed case one summand is given in affine, the other in projective
coordinates.)

complexity as (s, m, i)    affine     mixed      projective
Addition                   (1,2,1)    (3,11,0)   (5,15,0)
Doubling                   (1,2,1)    -          (5,5,0)

As we notice, the complexity of one doubling is now (5, 5, 0). The time required
for precomputation is thus 5 · 160 + 5 · 160 = 1600. The table points are then
given in projective coordinates, and the complexity of the remaining 55 point
additions is (5, 15, 0) each. Moreover, the result needs to be converted from
projective coordinates back to affine coordinates, which requires one inversion,
three multiplications and one squaring. This yields an additional cost of
15 · 55 + 5 · 55 + 1 + 3 + 40 · 1 = 1144, and thus the total cost for computing kP is
2744.

Using model II the most efficient precomputation is performed via affine
coordinates and employing algorithm 2. The complexity of one doubling is (1, 2, 0),
which gives 160 + 2 · 160 = 480 for the cost of precomputation. The remaining
55 point additions should again be done using projective coordinates. Of these
additions 40 belong to the mixed category, the remaining 15 being purely projective.
This gives a cost of (40 · 3 + 15 · 5 + 1) + (40 · 11 + 15 · 15 + 3) + 40 · 1 = 904, as again
the final result needs to be converted back to affine coordinates. The total cost
is then 1384, which again is about 50% of the cost of model I.

We summarize the results in the table below. It should be remembered that
the times are relative: no comparison between the PC setup and the smart card setup
can be made. Also, these figures give only the maximum possible savings with our
methods. We have not included A’s computation times and, more importantly, 
the times required for data transfer between A and C. The main application we 
have had in mind is a smart card. Unfortunately with the current technology i/o 
times to and from the card are too slow to exploit these methods. 





                          precomputation                  complete computation
                          compl.           cost   sav.    compl.           cost   sav.
PC          model I       (160,320,160)    2080           (215,430,215)    2795
            model II      (320,200,0)       720   65%     (375,310,55)     1435   49%
smart card  model I       (800,800,0)      1600           (1076,1628,1)    2744
            model II      (160,320,0)       480   70%     (356,988,1)      1384   50%
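The entries of the table follow from the per-operation complexities and the relative operation costs given in this section, so they can be recomputed in a few lines. The sketch below does this; the helper names (`cost`, `scale`, `plus`) and variable names are hypothetical, while the numbers are the ones stated in the text (n = 160, w = 4, l = 40, and hence 55 point additions).

```python
def cost(triple, weights):
    """Weighted cost of a complexity triple (s, m, i)."""
    return sum(count * wgt for count, wgt in zip(triple, weights))

def scale(triple, times):
    return tuple(times * t for t in triple)

def plus(*triples):
    return tuple(map(sum, zip(*triples)))

PC = (1, 2, 8)         # squaring = 1, multiplication = 2, inversion = 4 multiplications
CARD = (1, 1, 40)      # squaring = multiplication = 1, inversion = 40

n, w, l = 160, 4, 40
c_k = l + 2**w - 1     # additions in the k-dependent part

# PC: affine coordinates throughout.
pc_pre_1 = scale((1, 2, 1), n)                        # model I, plain doublings
pc_pre_2 = plus(scale((2, 1, 0), n), (0, l, 0))       # model II, algorithm 3
pc_main = scale((1, 2, 1), c_k)

# Smart card: projective additions plus one conversion back to affine (1, 3, 1).
card_pre_1 = scale((5, 5, 0), n)
card_main_1 = plus(scale((5, 15, 0), c_k), (1, 3, 1))
card_pre_2 = scale((1, 2, 0), n)                      # model II, algorithm 2
card_main_2 = plus(scale((3, 11, 0), l), scale((5, 15, 0), c_k - l), (1, 3, 1))

rows = [
    ("PC model I", pc_pre_1, plus(pc_pre_1, pc_main), PC),
    ("PC model II", pc_pre_2, plus(pc_pre_2, pc_main), PC),
    ("card model I", card_pre_1, plus(card_pre_1, card_main_1), CARD),
    ("card model II", card_pre_2, plus(card_pre_2, card_main_2), CARD),
]
for name, pre, total, weights in rows:
    print(name, pre, cost(pre, weights), total, cost(total, weights))
```

Changing the weight triples shows how sensitive the savings are to the relative cost of inversion on a given platform.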



6 Conclusion 

We have shown that in elliptic curve cryptography a device with restricted com- 
puting power can exploit external computational resources without jeopardizing 
the security. In theory about 50% of the computation times can be saved us- 
ing the described methods. We have also considered the dangers of trusting the 
external device.

Abstract. In a reputation-based trust management system an entity’s
behaviour determines its reputation, which in turn affects other entities’
interaction with it. We present a mathematical model for trust aimed
at global computing environments which, as opposed to many traditional
trust management systems, supports the dynamics of reputation-based
systems in the sense that trusting relationships are monitored and
change over time depending on the behaviour of the entities involved.
The main contribution is the discovery that the notion of event structures,
well studied e.g. in the theory of concurrency, can faithfully model
the important concepts of observation and outcome of interactions. In
this setting observations are events and an outcome of an interaction is
a maximal set of consistent events describing what happened. We also
touch upon the problem of transferring trust or behavioural information
between contexts, and we propose a generalised definition of morphism
of event structures as an information-transfer function.



1 Introduction 

In the Global Computing (GC) vision very large numbers of networked, mobile, 
computational entities interact to fulfill their respective goals. To be successful 
in such environments, entities (the terms principal, agent and entity are used 
synonymously) must collaborate, must be capable of operating under only partial 
information, and security decisions must be made autonomously, as no central 
authority is feasible. 

The classical trust management approach [1], first introduced by Blaze,
Feigenbaum and Lacy in [2], was proposed as a solution to the inadequacy of traditional
security mechanisms in larger decentralised environments. Roughly, a classical
trust management system deals with deciding the so-called compliance checking
problem: given a request together with a set of credentials, does the request
comply with the local security policy of the provider? The same authors also
developed tool-support in the form of PolicyMaker [2, 3] and later KeyNote [4]
for handling the trust management problem. In his paper [5], Weeks presented a
simple mathematical framework, and showed how this framework would
instantiate to various existing trust management systems, including KeyNote, SPKI
[6] and some logic based systems (see [5] for details), sometimes even leading to
more efficient algorithms for the compliance checking problem. The framework
expresses a trust management system as a complete lattice $(D, \le)$ of possible
authorisations, a set of principal names $\mathcal{P}$, and a language for specifying so-called
licenses. The lattice elements $d, e \in D$ express the authorisations relevant for a
particular system, e.g. access-rights, and $d \le e$ means that e authorises at least
as much as d. An assertion is a pair $a = (p, l)$ consisting of a principal $p \in \mathcal{P}$, the
issuer, and a monotone function $l : (\mathcal{P} \to D) \to D$, called a license. In the
simplest case l could be a constant function, say $d_0$, meaning that p authorises $d_0$.
In the general case the interpretation of a is: given that all principals authorise
as specified in the authorisation map, $m : \mathcal{P} \to D$, then p authorises as specified
in l(m). This means that a license such as $l(m) = m(A) \vee m(B)$ expresses a
policy saying “give the lub of what A says and what B says”. Weeks showed
that a collection of assertions $L = \{(p_i, l_i)\}_{i \in I}$ gives rise to a monotone function
$L_\lambda : (\mathcal{P} \to D) \to \mathcal{P} \to D$, with the property that a coherent authorisation map
representing the authorisations of the involved principals is given by the least
fixed point, $\mathrm{lfp}\, L_\lambda$.
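On a finite lattice the least fixed point can be reached by iterating the induced function from the bottom authorisation map. The sketch below illustrates this on a deliberately simple instance. Everything in it is an assumption made for illustration and not part of Weeks' framework as such: the lattice is the chain 0 ≤ 1 ≤ 2 ≤ 3 with max as lub, each principal issues exactly one license, and the function name is hypothetical.

```python
def least_fixed_point(licenses, bottom):
    """Iterate the map induced by the licenses, starting from the bottom map.

    licenses maps each principal to a monotone function from authorisation
    maps (dicts principal -> lattice element) to lattice elements. On a
    finite lattice the iteration stabilises at the least fixed point.
    """
    m = {p: bottom for p in licenses}
    while True:
        nxt = {p: lic(m) for p, lic in licenses.items()}
        if nxt == m:
            return m
        m = nxt


# Toy lattice: the chain 0 <= 1 <= 2 <= 3, with max as the lub.
licenses = {
    "A": lambda m: 2,                        # constant license
    "B": lambda m: 1,                        # constant license
    "C": lambda m: max(m["A"], m["B"]),      # "the lub of what A says and what B says"
    "E": lambda m: m["E"],                   # refers only to itself
}
print(least_fixed_point(licenses, bottom=0))
```

The principal E, whose license refers only to itself, ends up with the bottom authorisation; this is the behaviour for unknown principals that the following paragraph discusses.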

The ideas on trust management systems seeded a substantial amount of
research in the area of security in large distributed systems, but as noted in [7],
which serves as a survey of existing systems as of 2000, the current trust
management solutions do not adequately deal with the dynamic aspects of trust: a
trusting relationship evolves over time and requires monitoring and reevaluation.
In [8, 9] it was argued that while the idea of having mutually referring licenses
resolved by fixed points was good, the Weeks-framework for trust would be too
restrictive in GC environments. One reason is that principals often do not have
sufficient information to specify precise authorisations for all other principals. In
the framework this means that any unknown or only partially known principal is
always assigned the bottom authorisation. The proposed solution was to have the
set T of “authorisations”, here called trust values, equipped with two orderings,
denoted $\preceq$ and $\sqsubseteq$. Here $\preceq$, called the trust ordering, corresponds to Weeks’ way
of ordering by “more privilege”, whereas $\sqsubseteq$, called the information ordering,
introduces a notion of precision or information. The key idea was that the elements
of the set should embody also various degrees of uncertainty, and then $d \sqsubseteq e$
reflects that e is more precise or contains more information than d. In the
simplest of cases the trust values could be just symbolic, e.g. unknown $\sqsubseteq$ low $\preceq$ high,
but they might also have more structure, as will become clear in the following
sections. It was shown how least fixed points with respect to the information
ordering lead to a way of distinguishing an unknown principal from a known
and distrusted one.

The notion of reputation-based systems (see e.g. [10-12]) also addresses some
of these issues. In a reputation-based system, an agent’s past behaviour
determines, together with local security policies, how other agents assign privileges to
that agent, and more generally affects any decisions concerning that agent. The
SECURE project [13, 14] aims at providing a framework for decision-making in
GC environments, based on the notion of trust. The formal model for trust
deployed is that of [8, 9], and a particular application defines a triple $(T, \sqsubseteq, \preceq)$ of
trust values with the two orderings. In this model, trust exists between
principals, and so for any principals P and Q the trust that P has in Q is modelled as
an element of T. As in the Weeks-framework this value is defined in terms of a
license issued by P which is called P’s trust policy. Thus, at any given time the
trust-state of the system can be described as a function, $m : \mathcal{P} \to \mathcal{P} \to T$, where
$\mathcal{P}$ is the set of principals, and the interpretation is that m(P)(Q) describes P’s
trust in Q. At any time there is a unique trust-state describing how principals
trust, and this state is the $\sqsubseteq$-least fixed point of the monotone function induced
by the collection of all licenses.

In SECURE, each principal P has its own decision making framework which 
is invoked when an application needs to make some decision involving another 
principal. The decision making framework contains three primary components: 
the risk engine, the trust engine, and the collaboration monitor. At the most 
abstract level, the collaboration monitor records the behaviour of principals with 
which P has interacted. This information together with a trust policy defines 
how P assigns trust values to any other principal. The trust information, in turn, 
serves as a basis for a risk analysis of any interaction. In fact, with each type 
of interaction with a principal, say Q, there is a finite set of possible outcomes 
of the interaction. The outcome that occurs is determined by the behaviour of 
Q. Each of these outcomes has an associated cost (see the footnote), which could be
represented simply as a number, but could also be a more complex object like a
probability distribution on, say, $\mathbb{R}$. Since the outcome depends on Q, the decision of how
to interact is based on the trust in Q. In this set-up it is necessary that the 
trust value for Q carries enough information that estimation of the likelihood 
of each of the outcomes is possible. If this estimation is possible, one may start 
reasoning about risk, e.g. the expected cost of an interaction. The rest of this 
paper describes the model for trust deployed in SECURE applications. 
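As a minimal sketch of the risk reasoning described above, the following Python fragment computes the expected cost of one interaction from estimated outcome likelihoods. The outcome names, probabilities and costs are invented for illustration and are not taken from SECURE; costs are plain numbers, with a negative number read as a benefit.

```python
def expected_cost(likelihood, cost):
    """Expected cost of an interaction.

    likelihood: dict mapping each outcome to its estimated probability
    cost: dict mapping each outcome to a number (negative means benefit)
    """
    total = sum(likelihood.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome likelihoods must sum to 1")
    return sum(p * cost[outcome] for outcome, p in likelihood.items())


# Hypothetical estimates for an interaction with a principal Q.
likelihood = {"honest": 0.5, "late": 0.25, "fraud": 0.25}
cost = {"honest": -10.0, "late": 2.0, "fraud": 40.0}
risk = expected_cost(likelihood, cost)
```

A decision rule could then compare `risk` against a threshold chosen by the application; how the likelihoods are derived from trust values is the subject of the following sections.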



2 An Evidence Based Framework 

As we discussed in the previous section the SECURE architecture brings forward 
the need for a formal model for trust supporting the approximation of likeli- 
hood of interaction outcomes, based on previous observations. We now propose 
a framework supporting this reasoning. We will use the mathematical structures
known as event structures (see [15] for an original reference and the handbook
chapter [16] for an extensive reference).

Footnote: The term cost should be understood more generally as cost or benefit. If costs are
represented as non-negative numbers, one might represent benefit as negative numbers.

Definition 1 (Event Structure). An event structure is a triple $(E, \le, \#)$
consisting of a set E of events which are partially ordered by $\le$, the necessity
relation (or causality relation), and $\#$ is a binary, symmetric, irreflexive relation
$\# \subseteq E \times E$, called the conflict relation. The relations satisfy

$\{e' \in E \mid e' \le e\}$ is finite,

if $e \mathrel{\#} e'$ and $e' \le e''$ then $e \mathrel{\#} e''$

for all $e, e', e'' \in E$. We say that two events are independent if they are not in
either of the two relations.

As an example, the event structure in Figure 1 could model a small scenario 
where a principal may ask a bank for the transfer of electronic cash from its bank 
account to an electronic wallet. After making the request, the principal observes 
that the request is either rejected or granted. After a successful transaction, 
the principal could observe that the cash sent in the transaction is forged or 
perhaps run an authentication algorithm to establish that it is authentic. Also, 
the principal could observe a withdrawal from its bank account with the present 
transaction’s id, and this withdrawal may or may not be of the correct amount. 
The two basic relations on event structures have an intuitive meaning in our set 




Fig. 1. An event structure describing our example. The curly lines ~ describe the 
immediate conflict relation and pointed arrows, the causality relation 



up. An event may exclude the possibility of the occurrence of a number of other
events. In our example the occurrence of the event 'transaction rejected' clearly
excludes the event 'transaction granted'. The necessity relation is also natural:
some events are only possible when others have already occurred. In the example
structure, 'money forged' only makes sense in a transaction where the transfer of
money actually did occur. Whether the e-cash is forged and whether the correct
amount is charged are two independent observations that may be made in
any order, which is modelled as independence in the event structure.

Definition 2 (Configurations of an Event Structure). Let $ES = (E, \le, \#)$
be an event structure. Say that a subset of events $x \subseteq E$ is consistent if it satisfies
the following two properties:

1. Conflict free: for any $e, e' \in x$ : $\neg(e \mathrel{\#} e')$ (i.e. $(e, e') \notin \#$).

2. Necessity downwards closed: for any $e \in x$, $e' \in E$ : $e' \le e \Rightarrow e' \in x$.

Define the configurations of ES, written $C_{ES}$, to be the set of consistent subsets
of E. We will define $C^0_{ES}$ to be the finite configurations. Define the relation $\to$ on
$C_{ES} \times E \times C_{ES}$ by

$x \xrightarrow{e} x'$ iff $e \notin x$ and $x' = x \cup \{e\}$

A (finite) configuration models information regarding the result of one interaction.
Note that the outcomes of an action correspond to the maximal
configurations (ordered by inclusion) of the event structure, and knowing the
outcome corresponds to having complete information. The configurations of our
example are given in Figure 2.
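The two conditions of Definition 2 can be checked mechanically. The sketch below enumerates the configurations of a small event structure modelled on the e-cash scenario. The encoding is an assumption: the event abbreviations follow the ones visible in the text (g, r, a, f, c, i), while their readings and the exact causality and conflict pairs are inferred from the prose description rather than read off Figure 1.

```python
from itertools import combinations

# Assumed encoding of the e-cash example. Readings of the abbreviations:
# g = granted, r = rejected, a = authentic, f = forged,
# c = correct amount, i = incorrect amount.
EVENTS = ["g", "r", "a", "f", "c", "i"]
CAUSES = {"a": {"g"}, "f": {"g"}, "c": {"g"}, "i": {"g"}}  # immediate causes
IMMEDIATE_CONFLICT = {("g", "r"), ("a", "f"), ("c", "i")}


def below(e):
    """Return every event e' with e' <= e (reflexive, transitive closure)."""
    seen, stack = {e}, [e]
    while stack:
        for cause in CAUSES.get(stack.pop(), ()):
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


def in_conflict(e1, e2):
    """Conflict is inherited along <=, so compare all causes of both events."""
    return any(
        (a, b) in IMMEDIATE_CONFLICT or (b, a) in IMMEDIATE_CONFLICT
        for a in below(e1)
        for b in below(e2)
    )


def is_configuration(x):
    """Conflict free and necessity downwards closed, as in Definition 2."""
    conflict_free = not any(in_conflict(a, b) for a in x for b in x)
    downwards_closed = all(below(e) <= x for e in x)
    return conflict_free and downwards_closed


def configurations():
    subsets = (frozenset(s) for n in range(len(EVENTS) + 1)
               for s in combinations(EVENTS, n))
    return [x for x in subsets if is_configuration(x)]


def maximal(configs):
    """Configurations not strictly included in another one (the outcomes)."""
    return [x for x in configs if not any(x < y for y in configs)]
```

Under this encoding `maximal(configurations())` yields the outcomes of the interaction, for example the rejected transaction `{r}` and the fully successful one `{g, a, c}`.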



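[Figure: diagram of the configurations ordered by inclusion, from the empty configuration $\emptyset$ at the bottom up to maximal configurations such as {g,a,c} and {g,f,i}.]

Fig. 2. Configurations of the event structure in Figure 1. The lines indicate inclusion
and the events are abbreviated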



We can now be more precise about the role of the collaboration monitor in 
the SECURE framework. Informally, its function is to monitor the behaviour 
of principals with whom interaction is made. For a particular interaction the 
possible events that may occur are modelled by an event structure, say ES. 
The information about the outcome of this interaction is simply a configuration,
$x \in C_{ES}$.

Definition 3 (Interaction History). Let $ES = (E, \le, \#)$ be an event structure.
Define an interaction history in ES to be a finite ordered sequence H of
configurations, $H = x_1 x_2 \cdots x_n \in C^{0*}_{ES}$. The individual components $x_i$ in the
history H will be called interactions.

An interaction history in the event structure from Figure 1 could be the sequence 
{g, a, c}{g, c}{g}{r}. The concept of interaction histories models one principal’s 
recording of previous interactions with another. When the collaboration monitor 
learns about the occurrence of an event, e, this information is increased. We 
define a simple relation expressing this operation. 
Definition 4 (Information Relation). Let $ES = (E, \le, \#)$ be an event structure
and let $H = x_1 \cdots x_n$ and $K = y_1 \cdots y_n$ be interaction histories in ES,
$e \in E$ an event, and $i \in \mathbb{N}$, $1 \le i \le n$ an index. Then define:

$H \xrightarrow{(e,i)} K$ iff $x_i \xrightarrow{e} y_i$ and $\forall (1 \le j \le n) : j \ne i \Rightarrow x_j = y_j$

and also let new be a special event, $\mathit{new} \notin E$; then

$H \xrightarrow{\mathit{new}} K$ iff $K = H \cdot \emptyset$

Let $H \to K$ denote that either $H \xrightarrow{\mathit{new}} K$ or there exists $e \in E$, $i \in \mathbb{N}$ so that
$H \xrightarrow{(e,i)} K$, and let $\to^*$ denote the reflexive and transitive closure of $\to$.



2.1 Evaluating Evidence 

We will equate the notion of trust values with "evidence values", that is, values
expressing the amount of evidence regarding a particular partial outcome (i.e. a
configuration). We will consider the derivation of such values based on interaction
histories.

Consider an event structure $ES = (E, \le, \#)$. A trust value will be a function
from $C_{ES}$ into a domain of evidence values. The function applied to a configuration
$x \in C_{ES}$ is then a value reflecting the evidence for x. It will be natural
to express this evidence value as a triple of natural numbers $(s, i, c) \in \mathbb{N}^3$. The
interpretation is that out of $s + i + c$ interactions, s of these support the occurrence
of configuration x, c of these contradict it, and i are inconclusive about x
in the sense that they do not support or contradict it.

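Definition 5. Let $ES = (E, \le, \#)$ be an event structure and let x be a configuration
of ES. Define the effect of x as a function, $\mathrm{eff}_x : C_{ES} \to \mathbb{N}^3$ by

$$\mathrm{eff}_x(w) = \begin{cases} (1, 0, 0) & \text{if } w \subseteq x \\ (0, 0, 1) & \text{if } x \mathrel{\#} w \text{ (i.e. } \exists e \in x, e' \in w : e \mathrel{\#} e') \\ (0, 1, 0) & \text{otherwise} \end{cases}$$

Also for $(s, i, c), (s', i', c') \in \mathbb{N}^3$ define $(s, i, c) + (s', i', c') = (s + s', i + i', c + c')$.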

The intuition behind the definition of $\mathrm{eff}_x$ is the following. Think of x as a
configuration which has already been observed. We are now considering engaging
in another interaction which will end up in some configuration. Thus, we would
like to estimate the likelihood of ending up in a particular configuration w, given
that the last interaction ended in x. There are exactly three cases for any fixed
configuration w: if $w \subseteq x$ then the fact that x occurred last time supports the
occurrence of w. If instead $x \mathrel{\#} w$ then x contains an event which rules out the
configuration w. Finally, if neither of these is the case, i.e. w didn't occur but
also wasn't excluded, we say that x is inconclusive about w. There is a strong
similarity between this division of configurations into three disjoint classes and
the way Jøsang [17] derives his uncertain probabilities in the Dempster-Shafer
framework for evidence [18]. We discuss this in the concluding section.
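A small worked instance may help to fix the three cases. It uses the event abbreviations of the running example and assumes that a (authentic) is in conflict with f (forged); this conflict is inferred from the scenario description rather than read off Figure 1.

```latex
\begin{align*}
\mathrm{eff}_{\{g,a,c\}}(\{g,a\}) &= (1,0,0) && \text{since } \{g,a\} \subseteq \{g,a,c\} \\
\mathrm{eff}_{\{g,a,c\}}(\{g,f\}) &= (0,0,1) && \text{since } a \mathrel{\#} f \\
\mathrm{eff}_{\{g\}}(\{g,a\})     &= (0,1,0) && \text{since } \{g,a\} \not\subseteq \{g\} \text{ and no conflict arises}
\end{align*}
```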
Definition 6. Let $ES = (E, \le, \#)$ be an event structure. Define the function
$\mathrm{eval} : C^{0*}_{ES} \to (C_{ES} \to \mathbb{N}^3)$ by

$$\mathrm{eval}(x_1 x_2 \cdots x_n) = \lambda w. \sum_{i=1}^{n} \mathrm{eff}_{x_i}(w)$$

We would like to note that the functions eff and eval allow for many useful
variations when computing trust values from interaction histories. For example,
suppose we want to model a "memory" so that a principal only remembers the
last $M + 1 \in \mathbb{N}$ interactions. This could be done by simply taking

$$\mathrm{eval}^M(x_1 x_2 \cdots x_n) = \lambda w. \sum_{i=n-M}^{n} \mathrm{eff}_{x_i}(w) = \mathrm{eval}(x_{n-M} x_{n-M+1} \cdots x_n)$$

One could also imagine older interactions "counting less", which could be modelled
by scaling and rounding the value of, say, the interactions older than a
certain boundary.
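The functions eff and eval, including the memory variant, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SECURE implementation: the conflict pairs encode an assumed reading of the running example, the function is named `evaluate` to avoid shadowing Python's built-in `eval`, and the `memory` parameter is a hypothetical name for M.

```python
# Conflict relation of the running example, closed under inheritance.
# The event names and conflict pairs are assumptions about Figure 1.
CONFLICT = {frozenset(p) for p in [
    ("g", "r"), ("a", "f"), ("c", "i"),
    ("r", "a"), ("r", "f"), ("r", "c"), ("r", "i"),
]}


def eff(x, w):
    """Effect of an observed configuration x on a configuration w (Definition 5)."""
    if w <= x:
        return (1, 0, 0)  # x supports w
    if any(frozenset((e, e2)) in CONFLICT for e in x for e2 in w):
        return (0, 0, 1)  # x contradicts w
    return (0, 1, 0)      # x is inconclusive about w


def evaluate(history, w, memory=None):
    """Sum the effects over a history; memory=M keeps only the last M + 1 interactions."""
    if memory is not None:
        history = history[-(memory + 1):]
    s = i = c = 0
    for x in history:
        ds, di, dc = eff(x, w)
        s, i, c = s + ds, i + di, c + dc
    return (s, i, c)


# The interaction history {g,a,c}{g,c}{g}{r} used as an example in the text.
history = [frozenset("gac"), frozenset("gc"), frozenset("g"), frozenset("r")]
```

For this history, `evaluate(history, frozenset("g"))` counts three supporting interactions and one contradicting one, while restricting to `memory=1` only looks at the last two interactions.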

2.2 Ordering Evidence 

Given this intuition we will consider two orderings on evidence values: an information
ordering, and an ordering expressing "more evidence in favour of", which
we call the trust ordering.

Information order. The information ordering $\sqsubseteq$ of $\mathbb{N}^3$ is defined as follows:

$$(s, i, c) \sqsubseteq (s', i', c') \iff (s \le s') \wedge (c \le c') \wedge (s + i + c \le s' + i' + c')$$

The rationale is that $(s', i', c')$ represents more information than $(s, i, c)$ if
it can be obtained from $(s, i, c)$ by performing some additional number of
interactions, or by refining the information about a particular interaction (or
both). By refining we mean to change an "inconclusive" to "supporting" or
"contradicting". $\mathbb{N}^3_\top$ denotes the completion of $\mathbb{N}^3$ by a greatest element $\top_\sqsubseteq$.

Trust order. The trust ordering $\preceq$ of $\mathbb{N}^3$ is defined as:

$$(s, i, c) \preceq (s', i', c') \iff (s \le s') \wedge (c \ge c') \wedge (s + i + c \le s' + i' + c')$$

Here $(s', i', c')$ expresses "more evidence in favour of" than $(s, i, c)$ if it contains
more supporting evidence, less contradicting evidence, and still at least
as many interactions. Intuitively one can obtain $(s', i', c')$ from $(s, i, c)$ by
changing contradicting evidence to inconclusive or supporting, changing inconclusive
to supporting, or by adding inconclusive or positive events.
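The two orderings agree on some pairs and disagree on others. The following worked comparisons, checked directly against the two defining conditions, illustrate this; the triples are chosen for illustration only.

```latex
\begin{align*}
(1,2,0) &\sqsubseteq (2,1,0) && \text{and} & (1,2,0) &\preceq (2,1,0) \\
(1,0,0) &\sqsubseteq (1,0,1) && \text{but} & (1,0,0) &\not\preceq (1,0,1) \\
(1,0,1) &\preceq (1,1,0)     && \text{but} & (1,0,1) &\not\sqsubseteq (1,1,0)
\end{align*}
```

In the first line an inconclusive interaction is refined to a supporting one, which increases both information and trust. In the second line a contradicting interaction is added, which is more information but not more trust. In the third line contradicting evidence is weakened to inconclusive, which is more trust but not a refinement of information.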

Theorem 1. The structure $(\mathbb{N}^3_\top, \sqsubseteq)$ is a complete lattice. The binary join is
given by $(s_0, i_0, c_0) \sqcup (s_1, i_1, c_1) = (s, i, c)$ where $s = \max\{s_0, s_1\}$, $c = \max\{c_0, c_1\}$
and

$$i = \min\{i \in \mathbb{N} \mid s + i + c \ge \max\{s_0 + i_0 + c_0, s_1 + i_1 + c_1\}\}$$

The join with respect to $\top_\sqsubseteq$ is as expected, and the join of any infinite set is
$\top_\sqsubseteq$. Furthermore, the structure $(\mathbb{N}^3, \preceq)$ is a lattice. The binary $\preceq$-join is given
by $(s_0, i_0, c_0) \vee (s_1, i_1, c_1) = (s, i, c)$ where $s = \max\{s_0, s_1\}$, $c = \min\{c_0, c_1\}$ and

$$i = \min\{i \in \mathbb{N} \mid s + i + c \ge \max\{s_0 + i_0 + c_0, s_1 + i_1 + c_1\}\}$$

The meet is obtained dually. Finally, the join and meet functions for the trust
order, $\vee, \wedge : \mathbb{N}^3 \times \mathbb{N}^3 \to \mathbb{N}^3$, are monotone with respect to the information order.
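The two binary joins of Theorem 1 are easy to compute. The sketch below represents evidence values as plain `(s, i, c)` tuples and leaves out the top element; the function names are hypothetical.

```python
def info_leq(u, v):
    """u is below v in the information ordering."""
    return u[0] <= v[0] and u[2] <= v[2] and sum(u) <= sum(v)


def trust_leq(u, v):
    """u is below v in the trust ordering."""
    return u[0] <= v[0] and u[2] >= v[2] and sum(u) <= sum(v)


def _join(u, v, pick_c):
    s = max(u[0], v[0])
    c = pick_c(u[2], v[2])
    # Least i such that s + i + c is at least the larger of the two totals.
    i = max(0, max(sum(u), sum(v)) - s - c)
    return (s, i, c)


def info_join(u, v):
    """Binary join for the information ordering (c is the maximum)."""
    return _join(u, v, max)


def trust_join(u, v):
    """Binary join for the trust ordering (c is the minimum)."""
    return _join(u, v, min)
```

For instance, joining `(2, 1, 0)` and `(0, 0, 3)` gives `(2, 0, 3)` in the information ordering, which pools all the evidence, and `(2, 1, 0)` in the trust ordering, which keeps the more favourable value.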

In the following we use $\sqsubseteq$ also for the pointwise extension of $\sqsubseteq$ to trust
values, i.e. the functions $C_{ES} \to \mathbb{N}^3$. We can relate the $\to^*$ relation on interaction
histories with the information relation on trust values.

Proposition 1. Let ES be an event structure and $H, K \in C^{0*}_{ES}$ interaction
histories. Then eval is monotonic in the sense that if $H \to^* K$ then also
$\mathrm{eval}(H) \sqsubseteq \mathrm{eval}(K)$.

Some information is discarded by eval, and the following proposition explains 
what is lost. The function eval is injective up to rearranging the order of inter- 
actions. 

Proposition 2. Let $ES = (E, \le, \#)$ be an event structure and $H, K \in C^{0*}_{ES}$ be
interaction histories, $H = x_1 x_2 \cdots x_n$ and $K = y_1 y_2 \cdots y_m$. If $\mathrm{eval}(H) = \mathrm{eval}(K)$
then $n = m$ and there exists a permutation on n elements $\sigma : [n] \to [n]$ so that

$$H = \sigma(K) = y_{\sigma(1)} y_{\sigma(2)} \cdots y_{\sigma(n)}$$

Returning to the SECURE architecture, the risk engine uses trust values to
derive estimates of the likelihood of the various outcomes. Our trust values convey
sufficient information to enable estimation of probability distributions on the
configurations. There are several ways to do this, depending on the application.
For example one might derive an opinion $\omega_x = (b_x, u_x, d_x)$ for $x \in C_{ES}$ in the
sense of Jøsang, which gives rise to a probability density function (pdf) [17, 12].

3 Trust Policies 

As discussed in the introduction, each principal defines a local trust policy following
the idea from [5]. We give an example of a language for specifying such
policies. The syntax is given in Figure 3. A policy is a list of specific policies,
terminated by a general policy. The specific policies explicitly name a principal
and a corresponding trust expression ($\tau$), whereas the general policy applies to
any principal not explicitly listed. In this simple example language, the trust
expressions are built up from the basic constructs of "local reference" and "policy
reference", and these can then be combined with the various joins and meets
we have available. The two types of references are similar in that both refer to
a principal P's trust value for a principal Q. The difference is that the local
reference refers to P's personal observation on Q, whereas the policy reference
instead refers to the value that P would compute using its policy.
$\pi ::= * : \tau$ (default policy)
$\;\;\mid\; p : \tau ; \pi$ ($p \in \mathcal{P}$, specific policies)

$\tau ::= p \;\mathrm{loc}\; q$ (local reference to $p, q \in \mathcal{P} \cup \{*\}$)
$\;\;\mid\; p ? q$ (policy reference to $p, q \in \mathcal{P} \cup \{*\}$)
$\;\;\mid\; \tau_1 \;\mathit{binop}\; \tau_2$ (binary operation $\mathit{binop} \in \{\wedge, \vee, \sqcap, \sqcup\}$)

Fig. 3. An example policy language



The semantics of a policy is interpreted relative to an environment providing
for each pair P, Q of principals a trust value which we think of as being P's
interaction history with Q evaluated as in the previous section. This serves as the
data for the local references. Let $\mathrm{obs} : \mathcal{P} \to \mathcal{P} \to C_{ES} \to \mathbb{N}^3$ be a fixed function
representing this. The semantics of a policy $\pi$ is a function which takes as input
the observation data obs, and gives as output a $\sqsubseteq$-monotone function mapping
the global current trust state (an element in $GS = \mathcal{P} \to \mathcal{P} \to (C_{ES} \to \mathbb{N}^3)$), to
a local trust state (an element of $LS = \mathcal{P} \to (C_{ES} \to \mathbb{N}^3)$). We denote this as

$$[\![\pi]\!]^{obs} : GS \to LS$$

The semantic function $[\![\cdot]\!]^{obs}$ is defined by structural induction on the syntax of
$\pi$ in Figure 4. The definitions make use of the semantic function in Figure 5,
essentially mapping the syntactic category $\tau$ to an element of $C_{ES} \to \mathbb{N}^3$. This is
again interpreted relative to observations obs and the current trust state m : GS,
but also relative to an environment, $env : \{*\} \to \mathcal{P}$, which interprets * as a name
in $\mathcal{P}$. The env function extends trivially to a function of type $\{*\} \cup \mathcal{P} \to \mathcal{P}$ (the
identity on non-* elements). The semantics of a binop is the corresponding $\sqsubseteq$ or
$\preceq$ lub/glb (see the footnote), which is $\sqsubseteq$-monotone by Theorem 1.

$[\![* : \tau]\!]^{obs} = \lambda m \in GS.\, \lambda y \in \mathcal{P}.\, [\![\tau]\!]^{obs}(m)([* \mapsto y])$

$[\![p : \tau ; \pi]\!]^{obs} = \lambda m \in GS.\, \lambda x \in \mathcal{P}.$ if $(x = p)$ then $[\![\tau]\!]^{obs}(m)([* \mapsto p])$ else $[\![\pi]\!]^{obs}(m)(x)$

Fig. 4. Semantics of the policy language: syntactic category $\pi$

$[\![Y \;\mathrm{loc}\; Z]\!]^{obs}(m)(env) = \mathrm{obs}\,(env\, Y)\,(env\, Z)$ (where $Y, Z \in \mathcal{P} \cup \{*\}$)

$[\![Y ? Z]\!]^{obs}(m)(env) = m\,(env\, Y)\,(env\, Z)$ (where $Y, Z \in \mathcal{P} \cup \{*\}$)

$[\![\tau_1 \;\mathit{binop}\; \tau_2]\!]^{obs}(m)(env) = \big([\![\tau_1]\!]^{obs}(m)(env)\big) \;[\![\mathit{binop}]\!]\; \big([\![\tau_2]\!]^{obs}(m)(env)\big)$

Fig. 5. Semantics of the policy language: syntactic category $\tau$

We can now view a collection of mutually referring policies,

$$\Pi^{obs} = \{[\![\pi_p]\!]^{obs} \mid p \in \mathcal{P}\}$$

as defining a "web of trust", and define a unique monotone function $\Pi_\lambda^{obs}$,

$$\Pi_\lambda^{obs} = \langle [\![\pi_p]\!]^{obs} : p \in \mathcal{P} \rangle : GS \to GS$$

with the property that

$$\mathrm{Proj}_q \circ \Pi_\lambda^{obs} = [\![\pi_q]\!]^{obs}$$

for all $q \in \mathcal{P}$. This function essentially takes a piece of global trust information
m : GS and gives a piece of global trust information $\Pi_\lambda^{obs}(m)$ : GS which,
when applied to p and then to q, returns p's trust in q under $\pi_p$, given trust as
specified in m. Now, since the trust values $C_{ES} \to \mathbb{N}^3_\top$ form a complete lattice
with the information ordering, and since $\Pi_\lambda^{obs}$ is a monotone endo-function on
this structure, it has a unique least fixed point. We define the trust information
in a web of trust, $\Pi = \{\pi_p \mid p \in \mathcal{P}\}$ with local observations given by obs, as the
least fixed point of the induced function, $[\![\Pi]\!]^{obs} = \mathrm{lfp}\, \Pi_\lambda^{obs}$.

The interested reader is referred to [8, 9] for examples of policies. 
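To give a feel for the least fixed point, here is a deliberately simplified sketch. It assumes three hypothetical principals p, q and r, replaces trust values (functions on configurations) by a single evidence triple per pair of principals, and writes policies directly as Python functions instead of parsing the policy language. The fixed point is approximated by iterating from the bottom state; this terminates for the example below, but the sketch makes no claim about termination in general.

```python
BOTTOM = (0, 0, 0)
PRINCIPALS = ["p", "q", "r"]

# Hypothetical local observations; missing pairs carry no evidence.
OBS = {("p", "r"): (1, 0, 0), ("q", "r"): (0, 1, 2)}


def obs(a, b):
    return OBS.get((a, b), BOTTOM)


def info_join(u, v):
    """Binary join for the information ordering on (s, i, c) triples."""
    s, c = max(u[0], v[0]), max(u[2], v[2])
    i = max(0, max(sum(u), sum(v)) - s - c)
    return (s, i, c)


# Each policy maps a global state m and a target t to a trust value.
# p and q refer to each other, so the policies are mutually recursive.
POLICIES = {
    "p": lambda m, t: info_join(obs("p", t), m["q"][t]),
    "q": lambda m, t: info_join(obs("q", t), m["p"][t]),
    "r": lambda m, t: obs("r", t),
}


def step(m):
    """One application of the function induced by all policies."""
    return {a: {t: POLICIES[a](m, t) for t in PRINCIPALS} for a in PRINCIPALS}


def least_fixed_point():
    m = {a: {t: BOTTOM for t in PRINCIPALS} for a in PRINCIPALS}
    while True:
        nxt = step(m)
        if nxt == m:
            return m
        m = nxt
```

In this example p and q end up with the same trust value for r, namely the information join of their two observations, even though each policy only mentions its own observation and the other principal's policy.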

4 Transferring Information 

The example policy language in the previous section allows principals to share
trust information by means of the reference constructs. However, we were implicitly
assuming that all principals agree on the event structure used. One event
structure describes a particular context, i.e. there is one event structure for each
possible way of interacting. It is useful to be able to map trust values between
contexts that are somehow related, e.g. if one has only very little information
about context $ES_1$ but much information about a related context $ES_2$, it is often
useful to somehow apply the knowledge of $ES_2$ to give an estimate in $ES_1$. We
are aiming at formalising the kind of evidential transfer we all employ in everyday
life, where e.g. observations of an individual A's behaviour with respect to
timely payment of bills also affect our trust in A with respect to the question
of whether to lend him money. We propose a definition of a morphism of event
structures enabling such an information transfer.

Footnote (Section 3, semantics of binop): We use a strict version of the $\preceq$-lub/glb, which is the $\top$-strict extension of
$\vee, \wedge : \mathbb{N}^3 \times \mathbb{N}^3 \to \mathbb{N}^3$ to a function $\mathbb{N}^3_\top \times \mathbb{N}^3_\top \to \mathbb{N}^3_\top$ which is also monotone.

Definition 7 (Morphism of event structures). Let $ES = (E, \le, \#)$ and
$ES' = (E', \le', \#')$ be event structures. A morphism of event structures, $\eta : ES \to ES'$,
is a function $\eta : E' \to 2^E$ which has the following two properties:

1. Monotonic: For any $e', e'' \in E'$ if $e' \le' e''$ then we have

$\forall e_2 \in \eta(e'')\ \exists e_1 \in \eta(e') : e_1 \le e_2$

2. Preserves conflict: For any $e', e'' \in E'$ if $e' \mathrel{\#'} e''$ then

$\forall e_1 \in \eta(e')\ \forall e_2 \in \eta(e'') : e_1 \mathrel{\#} e_2$

A morphism $\eta : ES \to ES'$ can be thought of as a transfer of evidence from
ES to ES'. The idea is that $e \in \eta(e')$ means that an occurrence of e in ES is
an indication of the event e' occurring in ES'. We will think of the set $\eta(e')$ as
a disjunction of conditions in the sense that e' occurs if there is some $e \in \eta(e')$
which has occurred in ES. If $\eta(e') = \emptyset$ then we say that e' has no enabling
condition under $\eta$.

Definition 8 (Category of Event Structures, $\mathbf{E}$). Consider the following
categorical data, which we will call the category of event structures, and denote $\mathbf{E}$.

- Objects are event structures $ES = (E, \le, \#)$.

- Morphisms $\eta : ES \to ES'$ are the morphisms of Definition 7.

- Identities $I_{ES} : ES \to ES$ are the functions $I_{ES} : E \to 2^E$ given by
$I_{ES}(e) = \{e\}$.

- For $\eta : ES \to ES'$ and $\epsilon : ES' \to ES''$, the composition $\epsilon \circ \eta : ES \to ES''$ is
given by the following function $\epsilon \circ \eta : E'' \to 2^E$:

$$(\epsilon \circ \eta)(e'') = \bigcup_{e' \in \epsilon(e'')} \eta(e')$$

Proposition 3 ($\mathbf{E}$ is a category). The definition of $\mathbf{E}$ yields a category.

A morphism $\eta : ES \to ES'$ can then be used to map configurations of ES
to configurations of ES' by the mapping $\bar{\eta} : C_{ES} \to C_{ES'}$,

$$\bar{\eta}(x) = \{e' \in E' \mid \exists e \in \eta(e') : e \in x\}$$

The axioms of morphisms imply that $\bar{\eta}(x)$ is a configuration, and the fact that
$\mathbf{E}$ constitutes a category means that we can compose the information transfer
functions to obtain information transfer functions.
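The transfer of configurations along a morphism, and the composition of morphisms, can be sketched as follows. A morphism is represented as a dictionary from events of the target structure to sets of events of the source structure. The bill-payment and loan events are hypothetical names loosely following the everyday example given earlier, and only the conflict-preservation property is checked here, since the example has no non-trivial causality.

```python
def transfer(eta, x):
    """Map a configuration x of ES to the events of ES' enabled by some event in x."""
    return frozenset(e2 for e2, enablers in eta.items() if enablers & x)


def compose(eps, eta):
    """Composition eps after eta: the union of eta(e') over all e' in eps(e'')."""
    return {e3: frozenset().union(*(eta[e2] for e2 in eps[e3])) for e3 in eps}


def preserves_conflict(eta, conflict, conflict2):
    """Property 2 of the morphism definition; conflicts are sets of 2-element frozensets."""
    return all(
        frozenset((e1, e2)) in conflict
        for pair in conflict2
        for a, b in [tuple(pair)]
        for e1 in eta[a]
        for e2 in eta[b]
    )


# Hypothetical contexts: ES records bill payments, ES' records loan repayment.
conflict_bills = {frozenset(("on_time", "late"))}
conflict_loans = {frozenset(("repaid", "defaulted"))}
eta = {"repaid": frozenset({"on_time"}), "defaulted": frozenset({"late"})}

predicted = transfer(eta, frozenset({"on_time"}))
```

Here an observed timely payment is transferred to the configuration containing `repaid` in the loan context, which can then be fed to the evidence functions of the target context.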

5 Conclusion 

We have proposed a mathematical framework for trust, and a way of deriving 
trust values from interaction histories. In this framework trust is identified with 
evidential information, arising from observed behaviour, allowing the estimation 
of likely future behaviour. The framework is deployed in the SECURE project, 
and has been used in concrete SECURE prototype applications for e.g. spam 
filtering [14]. The trust model fits well with the bi-ordered trust structures of [8,
9] and uses ideas from the framework of Weeks [5]. The way that trust values are
derived from interaction histories is similar to the way that belief and plausibility
functions are derived in [18], and the way in which Jøsang derives his "opinions"
from belief-mass assignments in the subjective logic [17]. Event structures can
be seen as a generalisation of the traditional frames of discernment from the
Dempster-Shafer theory of evidence. If one allows a generalised version of event
structures in which the conflict relation is allowed to be a subset of $E \times 2^E$
(where $e \mathrel{\#} X$ means that e cannot occur if X has occurred), it is not hard to see
that each frame of discernment $\Theta$ corresponds to an event structure with events
$\{\bar{p} \mid p \in \Theta\}$, where any event $\bar{p}$ is in conflict only with the set $\{\bar{q} \mid q \in \Theta, q \ne p\}$.
The understanding of a $\bar{p}$ is the exclusion of the state p. The configurations of the
event structure are isomorphic to the poset $(2^\Theta \setminus \emptyset, \subseteq)^{op}$. Furthermore, $x \cap y = \emptyset$
in $2^\Theta \setminus \emptyset$ iff the corresponding configurations are in conflict.

While the problem of transferring trust between related contexts has been
discussed, we still need to investigate the usefulness of our formalisation in terms
of event structure morphisms in concrete application scenarios. The concept of
morphisms seems to be appropriate for evidence transfer, but the exact definition
needs some further investigation. As an example, we have considered a
generalisation of the event structure morphisms presented, in which we allow
$\eta : E' \to 2^{C^0_{ES}}$, i.e. $\eta$ maps events to a disjunction of arbitrary finite configurations
instead of only prime configurations (i.e. configurations of the form
$\{e \in E \mid e \le e_0\}$ for some $e_0$). This generalised definition gives rise to a category
containing $\mathbf{E}$ as a subcategory.
Abstract. We survey the main cryptographic features of several major wireless
technologies. The cellular systems GSM/GPRS and UMTS (3G) are covered,
as are the shorter-range systems Wireless LAN and Bluetooth. We then
present problem areas in applying cryptography to these wireless systems,
with several examples given in each problem area.



1 Introduction 

In the first part of this paper we give a brief survey of cryptographic mechanisms
in some of the most important wireless technologies. On cellular systems, we
first describe security solutions in the GSM technology, the dominant global 
cellular standard. We continue by showing how the security model and security 
mechanisms were extended and enhanced in the successor of the GSM system, i.e. 
in the Universal Mobile Telecommunications System (UMTS), specified in the 
global 3rd Generation Partnership Project (3GPP). On shorter range wireless 
technologies we discuss Wireless LAN security, as standardized by IEEE, and 
also Bluetooth security that is specified by an industry consortium called the 
Bluetooth SIG (Special Interest Group). A typical use case of WLAN is access 
to Internet through a WLAN access point from distances up to several hundred 
meters while a typical Bluetooth use case is communication between two devices, 
e.g. a mobile phone and an accessory, with a distance in the order of ten meters. 

Cryptographic algorithms provide a major tool for wireless security, but defining
how the algorithms are used as a part of a communication system architecture
is a demanding task by itself. The second part of the paper covers issues with
applying cryptography in wireless systems of global scale. Several examples of
problems are presented. Solutions exist for most of these examples, but typically
they are not fully satisfactory. We study issues in the following
problem areas: composition of several mechanisms, continuity from legacy
systems and equipment to more secure solutions of the future, key management,
and constraints imposed by other functionalities.
2 GSM Cryptography

The essential cryptographic algorithms of GSM are explained in this section. The
A3 algorithm has a one-way property and it is the core of a challenge-response
protocol that is needed for authentication of users. Key generation is tightly
linked to authentication, and another algorithm with a one-way property, called
A8, is used for this purpose. The generated 64-bit key, Kc, is subsequently used
in the encryption algorithm A5 that is embedded in the physical layer of the
GSM radio interface. To be more precise, A3, A5 and A8 are names of algorithm
families. The GSM security architecture allows each algorithm to be replaced
by another one that has the same input-output structure. For encryption, three
different stream ciphers A5/1, A5/2 and A5/3 have been standardized so far by the
European Telecommunications Standards Institute (ETSI).
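The challenge-response flow can be illustrated with a short Python sketch. Since the real A3 and A8 algorithms are operator specific, HMAC-SHA256 is used here purely as a stand-in with the same input-output shape; it is not an actual GSM algorithm. The 64-bit length of Kc is from the text, while the 128-bit challenge and 32-bit response sizes follow common descriptions of GSM and should be read as assumptions of this sketch. All function names are hypothetical.

```python
import hashlib
import hmac
import os


def a3_stand_in(ki: bytes, rand: bytes) -> bytes:
    """Stand-in for A3: derive a 32-bit signed response from key and challenge."""
    return hmac.new(ki, b"A3" + rand, hashlib.sha256).digest()[:4]


def a8_stand_in(ki: bytes, rand: bytes) -> bytes:
    """Stand-in for A8: derive the 64-bit cipher key Kc from key and challenge."""
    return hmac.new(ki, b"A8" + rand, hashlib.sha256).digest()[:8]


def network_challenge() -> bytes:
    """The network picks a fresh random challenge."""
    return os.urandom(16)


def sim_respond(ki: bytes, rand: bytes):
    """Executed in the SIM: returns the response and the session key."""
    return a3_stand_in(ki, rand), a8_stand_in(ki, rand)


def network_verify(ki: bytes, rand: bytes, sres: bytes):
    """Executed on the network side, which holds the same subscriber key."""
    ok = hmac.compare_digest(sres, a3_stand_in(ki, rand))
    return ok, (a8_stand_in(ki, rand) if ok else None)
```

Note that only the user is authenticated in this flow: nothing in the exchange lets the SIM check that the challenge came from a legitimate network, which matches the limitation discussed later in this section.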

The situation is even more fragmented for A3 and A8. This is a consequence of
the fact that these algorithms need not be standardized at all. The algorithms
are executed only in two locations: in the SIM card inside the user terminal,
and in the Authentication Centre, which is a database in the user's home network.
Therefore, each mobile network operator may in principle use its own proprietary
algorithms.

In the packet-switched domain of the GSM system, i.e. in GPRS (General
Packet Radio Service), the radio interface encryption is replaced by encryption
on layer three of the radio network. This change has minor effects on the
input-output structure of the cryptographic algorithm, but there is a more substantial effect in
the sense that the protection is extended further into the network, i.e. the packet
data traffic is encrypted from the terminal all the way to the core network. At
the time of writing, there are three different standardized stream ciphers, GEA1,
GEA2 and GEA3, for GPRS.

The GSM security architecture, like any wide scale security architecture, is 
a result of a trade-off between cost and security. The parts of the system that 
were seen as most vulnerable have been protected well while some less vulnerable 
parts have no protection. As already mentioned above, in the circuit-switched part
of GSM the encryption covers only the radio interface between the terminal and
the base station. It is also worth noticing that the terminal does not execute
any explicit authentication of the network, thus leaving the terminal vulnerable
to certain types of active attacks. In particular, this refers to a malicious
party who has the required equipment to masquerade as a legitimate network
element and/or legitimate user terminal.

The GSM security architecture has also been criticized for keeping some
essential parts secret, e.g. the specifications of the cryptographic algorithms. This
secrecy does not create trust in the algorithms in the long run because they are not
publicly available for analysis with the most recent methods. Also, protection based
on global secrets is not effective because these secrets tend to be revealed sooner
or later.
3 UMTS Cryptography

The UMTS technology can be seen as a successor of GSM. Indeed, the core 
network part of UMTS is an enhanced version of the GSM core network. On the 
other hand, there has been more revolutionary development in the radio network 
part. The evolution vs. revolution aspect is reflected in the security features. The 
UMTS authentication and key agreement mechanisms are executed between the
terminal and the core network; these mechanisms were created by enhancing the
GSM-type challenge-response user authentication protocol with network
authentication based on sequence numbers.

In the radio network there is more revolution in security features. Encryption
is provided by the f8 algorithm on radio layer two and, as a completely new
feature, the integrity protection algorithm f9 is applied to signaling messages on
radio layer three. In UMTS too, the symbols f8 and f9 refer to algorithm
families; only the input-output structure is fixed, not the internal structure of
the algorithms. Currently, one version of each algorithm has been standardized,
both based on the publicly specified KASUMI block cipher. At the time of
writing, the 3GPP has begun the process of specifying another pair of algorithms.
While it is not considered probable that the KASUMI algorithm will be broken in the
near future, an alternative would clearly increase the overall security level of
UMTS.

In the following subsections we briefly go through all the essential UMTS
security features. See [5] for further reading.

3.1 Mutual Authentication 

Three entities are involved in the authentication mechanism of the UMTS system:
the home network, the serving network (SN) and the terminal, or more specifically
the Universal Subscriber Identity Module (USIM, typically in a smart card). The
SN checks the subscriber's identity (as in GSM) by a challenge-response technique
while, on the other hand, the terminal checks that the SN has been authorised by
the home network to do so. As explained earlier, the latter feature is new in
UMTS (when compared to GSM).

The basis for the authentication mechanism is a master key K that is shared 
between the USIM and the home network. This is a permanent secret with the 
length of 128 bits. The key K is never transferred out from the two locations. 
In particular, the user has no way of getting to know her/his master key. A 
key agreement procedure is inseparably linked to the mutual authentication. It 
provides keys for encryption and integrity protection. These are temporary keys 
with the same length of 128 bits. Every time the USIM is authenticated, new 
keys are derived from the permanent key K . 
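The following sketch illustrates the idea of mutual authentication with key agreement in a heavily simplified form. It does not reproduce the actual 3GPP functions or message formats: HMAC-SHA256 stands in for the real derivation functions, and the labels, the 6-byte encoding of the sequence number and the 64-bit tag and response lengths are illustrative assumptions. Only the 128-bit lengths of K, CK and IK come from the text.

```python
import hashlib
import hmac
import os


def prf(k: bytes, label: bytes, *parts: bytes) -> bytes:
    """HMAC-SHA256 as a stand-in for the real key derivation functions."""
    return hmac.new(k, label + b"".join(parts), hashlib.sha256).digest()


def home_network_vector(k: bytes, sqn: int):
    """Build a challenge that only a holder of the master key K can produce."""
    rand = os.urandom(16)
    mac = prf(k, b"net-mac", rand, sqn.to_bytes(6, "big"))[:8]
    xres = prf(k, b"res", rand)[:8]   # expected user response
    ck = prf(k, b"ck", rand)[:16]     # 128-bit cipher key
    ik = prf(k, b"ik", rand)[:16]     # 128-bit integrity key
    return rand, sqn, mac, xres, ck, ik


class Usim:
    def __init__(self, k: bytes):
        self.k = k
        self.highest_sqn = 0

    def authenticate(self, rand: bytes, sqn: int, mac: bytes):
        """Check the network's tag and freshness, then derive RES, CK and IK."""
        expected = prf(self.k, b"net-mac", rand, sqn.to_bytes(6, "big"))[:8]
        if not hmac.compare_digest(mac, expected):
            raise ValueError("network authentication failed")
        if sqn <= self.highest_sqn:
            raise ValueError("sequence number not fresh (possible replay)")
        self.highest_sqn = sqn
        res = prf(self.k, b"res", rand)[:8]
        return res, prf(self.k, b"ck", rand)[:16], prf(self.k, b"ik", rand)[:16]
```

The serving network would compare the returned response with the expected one to authenticate the user, while the sequence number check is what lets the USIM reject a replayed challenge.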

3.2 Radio Encryption 

Once the user and the network have authenticated each other they may begin
secure communication. As described earlier, a cipher key CK is shared between
the core network and the terminal after a successful authentication. Before
encryption can begin, the communicating parties also have to agree on the encryption
algorithm. On the user side, encryption/decryption takes place in the terminal
(not in the USIM), and on the network side, the Radio Network Controller (RNC)
handles encryption/decryption. This means that the cipher key CK has to be
transferred from the core network to the radio access network, and inside the terminal
the CK is transferred over the USIM-terminal interface.

The UMTS encryption mechanism is based on a stream cipher concept be- 
cause it has an inherent advantage that the mask data can be generated before 
the actual plaintext is known. Then the final encryption is a very fast bit oper- 
ation. 
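The advantage of the stream cipher concept can be seen in a small sketch: the mask is computed from the key and a counter alone, and the plaintext only enters in the final XOR. The keystream generator below hashes the key with counters using SHA-256; it is a stand-in chosen for brevity and is not the KASUMI-based f8 algorithm. The `count` parameter is a hypothetical frame counter.

```python
import hashlib


def keystream(ck: bytes, count: int, length: int) -> bytes:
    """Generate `length` mask bytes from the cipher key and a frame counter."""
    out = b""
    block = 0
    while len(out) < length:
        out += hashlib.sha256(
            ck + count.to_bytes(4, "big") + block.to_bytes(4, "big")
        ).digest()
        block += 1
    return out[:length]


def xor_bytes(data: bytes, mask: bytes) -> bytes:
    """The fast final step: bitwise XOR of data with the mask."""
    return bytes(d ^ m for d, m in zip(data, mask))


ck = bytes(range(16))                     # hypothetical 128-bit cipher key
mask = keystream(ck, count=7, length=64)  # can be prepared before the data exists
plaintext = b"user data frame"
ciphertext = xor_bytes(plaintext, mask)
recovered = xor_bytes(ciphertext, mask)   # decryption is the same operation
```

Because the mask depends on the counter, a fresh counter value must be used for every frame; reusing a mask for two plaintexts would reveal their XOR.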



3.3 Integrity Protection 

The purpose of the integrity protection is to authenticate individual control
messages. Like encryption, integrity protection is applied between the terminal
and the RNC. The integrity key IK is generated during the authentication
and key agreement procedure, in the same way as the cipher key.

The output of the integrity protection mechanism is a message authentication
code that consists of 32 bits.
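As a rough illustration of a 32-bit message authentication code, the sketch below truncates HMAC-SHA256 to four bytes. This is a stand-in and not the f9 algorithm; the message counter included in the tag computation is an assumption of this sketch, added to show how a replayed message could be detected.

```python
import hashlib
import hmac


def mac32(ik: bytes, count: int, message: bytes) -> bytes:
    """32-bit tag over a counter and the message (truncated HMAC as a stand-in)."""
    data = count.to_bytes(4, "big") + message
    return hmac.new(ik, data, hashlib.sha256).digest()[:4]


def verify(ik: bytes, count: int, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag, mac32(ik, count, message))
```

A receiver that tracks the expected counter value would accept a control message only if `verify` succeeds for that value. With a 32-bit tag, a blind forgery attempt succeeds with probability about 2^-32 per try.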

3.4 Network Domain Security 

The term "network domain security" refers to protection of communication between
different 3GPP networks (and network elements). A basic notion in the
IPsec-based security of 3GPP systems is the security gateway. All control plane
IP communication towards external networks should go via security gateways.
These gateways use the Internet Key Exchange (IKE) protocol to establish IPsec
Security Associations between themselves. The protection method for the data
traffic is IPsec Encapsulating Security Payload (ESP).



3.5 SIP Security 

The 3GPP has specified an IP Multimedia Subsystem (IMS) that is a core net- 
work subsystem using Session Initiation Protocol (SIP) for session management 
and call control. The actual user data traffic (voice, video etc.) is carried over IP 
and in principle IMS may be built on top of any access technology that supports 
IP connectivity. 

When a User Agent (UA) in the terminal wants to get access to the IMS, it
typically first connects to the 3GPP radio network. In this process, the UMTS
security features described earlier are utilized: mutual authentication, and
integrity protection and encryption on the hop between the terminal and the RNC.
Through the core network, the UA is able to contact IMS nodes using SIP
signaling. The first contact in the IMS is a SIP proxy called the P-CSCF (Proxy
Call Session Control Function). Through it the UA is able to register itself
with the home IMS. At the same time, the UA and the home IMS authenticate each
other, based on a permanent shared master secret. They also agree on temporary
keys.

The SIP traffic between the visited IMS and the home IMS is protected by
network domain security mechanisms. The security associations used for this
purpose are not specific to the UA in question.

Next, the UA and the P-CSCF negotiate in a secure manner all parameters of
the security mechanisms to be used to protect further SIP signaling, e.g.
cryptographic algorithms. Finally, integrity protection of the first-hop SIP
signaling between the UA and the P-CSCF is started, using IPsec ESP.

3.6 Recent Developments 

At the time of writing, 3GPP is finalizing security mechanisms in the
following areas:

- Wireless LAN interworking with 3GPP systems;

- Multimedia Broadcast/Multicast Service;

- Generic Authentication Architecture;

- Presence, Messaging, Conferencing services; 

- Authentication framework for network domain security: this adds support 
of a Public Key Infrastructure (PKI) for key management. 



4 Bluetooth Cryptography 

The Bluetooth technology includes authentication and key agreement between
two peer devices, in which the cryptoalgorithm SAFER+ is used in an appropriate
mode. In Bluetooth, unlike in cellular systems, the authentication algorithm
has to be standardized because it is executed in terminal devices. The keys used
in authentication are first agreed by a pairing procedure in which a link key
is generated. For radio interface confidentiality, Bluetooth uses a stream cipher
tailor-made for this purpose.

This stream cipher allows fast and simple hardware implementation because it
uses linear feedback shift registers (LFSRs) as building blocks. The algorithm
has strong correlation properties, and the LFSRs are initialized for each data
packet to be encrypted. It is possible to encrypt both point-to-point and
point-to-multipoint connections.
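
To show why LFSRs are cheap to implement, here is a minimal sketch of a single
toy register. The 4-bit size, the tap positions and the function name lfsr_bits
are our own assumptions for illustration; the Bluetooth cipher combines several
much longer registers, which is not modelled here.

```python
from typing import List, Sequence


def lfsr_bits(state: int, taps: Sequence[int], nbits: int, count: int) -> List[int]:
    """Return `count` output bits of a Fibonacci LFSR.

    state : non-zero initial register content (nbits wide)
    taps  : bit positions (0 = output end) XORed to form the feedback bit
    """
    state &= (1 << nbits) - 1
    out = []
    for _ in range(count):
        out.append(state & 1)                 # output the lowest bit
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1      # XOR of the tapped bits
        state = (state >> 1) | (feedback << (nbits - 1))
    return out


# Toy 4-bit register; taps (0, 1) give the maximal period 2**4 - 1 = 15.
bits = lfsr_bits(state=0b0001, taps=(0, 1), nbits=4, count=30)
```

Each step needs only a shift and a few XOR gates. A single LFSR is linear and
therefore easy to predict, so practical ciphers combine several registers
through a non-linear function.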

The Bluetooth link layer security is based on the strength of the link key.
The link key either depends on the Bluetooth PIN or is fetched directly from
the application layer. Link layer security mechanisms, i.e. authentication and
encryption, can be activated directly or from the application. Anonymity
protection has to be provided by separate means if needed. A positioning attack
is possible because identification of Bluetooth devices is based on permanent
addresses. On the other hand, this hardly constitutes a severe threat in most
practical situations, since the coverage of Bluetooth radio is small.

5 WLAN Cryptography

The IEEE 802.11 group has specified the Wireless Local Area Network (WLAN)
technology. During that process, security mechanisms for WLAN, called Wired
Equivalent Privacy (WEP), were also developed. The name already indicates
that the goal was the same as in GSM, i.e. to provide a security level
comparable to that of wired networks. Unfortunately, the original design of WEP
has several weaknesses. For example, the RC4 cipher is used with short
initialization values, key management is weak in many implementations, and the
system lacks integrity protection and replay protection. At the time of writing,
the IEEE 802.11 group is finalizing a completely new set of security
mechanisms, including new cryptoalgorithms.

An industry consortium, the Wi-Fi Alliance, has already endorsed an
intermediate set of enhanced security mechanisms. This set is called WPA (Wi-Fi
Protected Access), and it is also implemented in many products.

Both WPA and the complete security specification (IEEE 802.11i) are based
on an authentication and key management framework (IEEE 802.1X). This makes it
possible to use EAP (Extensible Authentication Protocol) [4]. For instance,
password-based or public-key-based authentication can be used in EAP and, thus,
also in WPA and in 802.11i. WPA also includes improved usage of RC4 by means of
TKIP (Temporal Key Integrity Protocol). Furthermore, WPA adds message integrity
and replay protection.

The complete IEEE 802.11i adds adequate replacements for all features of
WEP. For instance, RC4 is replaced by AES. A wider range of network
configurations is supported, as well as the concepts of personal area networks,
roaming and handovers.



6 Composition of Mechanisms 

In this section and in the remainder of the paper, we study several issues that
have arisen when cryptographic techniques are introduced to wireless systems.
The examples are chosen to give an overall picture of the diversity of these
issues.

The first category consists of issues with the composition of security
mechanisms. Indeed, it is a well-known phenomenon in the security area that two
secure mechanisms can be combined in a way that looks reasonable at first sight
and yet provides no security at all.



6.1 Tunneled Authentication 

As already briefly mentioned in the previous section, the Extensible
Authentication Protocol (EAP) is a general protocol framework that supports
multiple authentication mechanisms. It allows a back-end server to implement the
actual mechanism while the authenticator element simply passes authentication
signaling through. EAP consists of several Request/Response pairs where Requests
are always sent by the network.

We now provide some analysis of the problem. We have an inner protocol
that is, for instance, a legacy GSM authentication protocol based on the SIM
card, built into EAP. Then there is an outer protocol, typically in the form of
a TLS tunnel. Note here that the inner legacy protocol is usually also in use
without any tunnelling, as is the case for GSM. Continuing with this GSM
example, a man-in-the-middle can set up a false cellular base station to ask
the terminal for responses to challenges.

Even in the case where the EAP protocol is used exclusively in tunnelled
mode, authentication of the TLS tunnel relies solely upon terminal actions. There
is a weak point here because the terminal user may easily accept an unknown
certificate. This kind of dependency on the user's actions is typically not
acceptable to network operators. Now the session keys are derived from the TLS
master key generated by the tunnel protocol (this is the same key that is used
to create the tunnel before the inner authentication can begin). The result is
that the keys potentially derived in the EAP protocol (e.g. the Kc in the case
of GSM authentication) are not used for the tunnel at all. Therefore, in the
best case, where the inner protocol cannot be used without the tunnel, the
security depends on the user's judgment on certificates. In the worst case,
where the inner protocol can also be used without the tunnel (e.g. the GSM
case), there is no way of knowing whether the end-point of the tunnel is
actually a man-in-the-middle instead of the legitimate end-point authenticated
in the inner protocol.

The main lesson to be learnt from this is not only that there is another case
where composing two secure protocols may result in an insecure protocol. It is
also important to note that using tunnelling to “improve” a remote
authentication protocol is a very common approach. Known vulnerable combinations
include at least HTTP Digest authentication and TLS, PEAP and any EAP subtype,
and PIC and any EAP subtype.

There are solutions that can be used to fix the problem but usually the exact 
fix needs to be tailored to the specific protocols. A typical solution could be to 
create a cryptographic binding between tunneling protocol and the authentica- 
tion protocol by, for instance, using a one-way function to compute session keys 
from tunnel secrets (e.g. TLS master key) and EAP secrets (e.g. IK, CK). See
[1] for further details on this issue. 
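
A cryptographic binding of this kind can be sketched as follows. This is only
an illustration under our own assumptions: HMAC-SHA256 plays the role of the
one-way function, and the function name bound_session_key, the label and the
placeholder secrets are hypothetical. The exact construction has to be tailored
to the protocols in question, as noted above.

```python
import hashlib
import hmac


def bound_session_key(tunnel_secret: bytes, inner_secret: bytes,
                      label: bytes = b"session key") -> bytes:
    """Derive a session key that depends on BOTH secrets.

    tunnel_secret : secret of the outer tunnel (e.g. a TLS master key)
    inner_secret  : secret from the inner authentication (e.g. Kc, or IK and CK)
    A length prefix keeps the encoding of the inputs unambiguous.
    """
    material = len(inner_secret).to_bytes(2, "big") + inner_secret + label
    return hmac.new(tunnel_secret, material, hashlib.sha256).digest()


tls_master = bytes(48)                  # placeholder tunnel secret
kc_client = bytes.fromhex("0123456789abcdef")
kc_server = bytes.fromhex("0123456789abcdef")

client_key = bound_session_key(tls_master, kc_client)
server_key = bound_session_key(tls_master, kc_server)

# A man-in-the-middle who terminates the tunnel knows tls_master but not the
# inner secret, so a guess at the inner secret yields a different key.
mitm_key = bound_session_key(tls_master, bytes(8))
```

The design point is that the end-point of the tunnel can only compute the
session key if it also took part in the inner authentication.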

6.2 Using IPsec for IMS Message Authentication 

As explained in an earlier section, IPsec ESP is used in the 3GPP system for
SIP message authentication. This implies that the identity to be authenticated
is the IP address. On the other hand, charging is based on the user's identity
at the SIP level. This constitutes the first problem in this example. Another
problem stems from the fact that the SIP level identity is authenticated for
registrations and keys are derived at the same time for IPsec ESP. These keys
should be taken into use immediately, but how does the IP layer know that its
old Security Association is not valid anymore?

These problems are not at all unsolvable. But the most straightforward
solution that has been created for the first problem, i.e. binding of different
layer identities, could also be claimed to be a layer violation, and layer
violations typically cause unnecessary restrictions for the system
architecture. The second problem was solved for 3GPP systems by introducing
special handling of port numbers, which some could call a misuse of port number
semantics.

6.3 Barkan-Biham A5/2 Attack 

This recent attack, see [2], exploits weaknesses in GSM cryptographic algorithms.
In particular, A5/2 can be broken fast in the “ciphertext only” model. A further
attack also exploits other legacy features of the GSM security system: A5/2 is
a mandatory feature in terminals, call integrity is based only on encryption, and
the same Kc can in principle be used in different algorithms.

An example attack, which allows the attacker to decrypt a strongly encrypted
call without using a brute-force method, goes as follows. First the attacker
passively catches the challenge RAND from the radio interface, ignores the
response SRES, and records the corresponding call encrypted with Kc
and A5/3. Then the attack turns active. The attacker replays the stored RAND
towards the victim and tells the victim to use the weaker algorithm A5/2. Now
the attacker is able to find Kc based on the received encrypted uplink signal.
Consequently, the earlier recorded call can also be decrypted by the attacker.

A proposed countermeasure in 3GPP (that is not yet accepted at the time 
of writing) would create an amendment to the GSM security architecture. It 
uses the fact that the random challenge RAND is the only variable information 
sent from home network to the terminal in the authentication. Now we may 
divide the space of all 128-bit RAND values into different classes with respect 
to which encryption algorithm is allowed to be used with the Kc derived from 
this particular RAND. A 32-bit flag could indicate to the terminal that this kind
of special RAND is in use, and 16 bits would further be used to indicate which
algorithms out of a total of 8 GSM and 8 GPRS encryption algorithms are allowed
to be used with the key derived from this special RAND. These parameter lengths
would imply that the effective length of RAND is reduced from 128 bits to
80 bits. Fortunately, this reduction would not cause any decrease in the overall
security level.
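
The bit budget of such a special RAND can be made concrete with a small sketch.
Only the field lengths (32, 16 and the remaining 80 bits) come from the text.
The flag value, the position of the fields inside RAND, the assignment of bits
to algorithms and all function names are hypothetical choices of ours.

```python
import secrets

RAND_BITS = 128
FLAG_BITS = 32
ALG_BITS = 16                                    # 8 GSM + 8 GPRS algorithms
FREE_BITS = RAND_BITS - FLAG_BITS - ALG_BITS     # 128 - 32 - 16 = 80

SPECIAL_FLAG = 0x5A5A5A5A                        # hypothetical marker value


def make_special_rand(allowed_mask: int) -> int:
    """Build a RAND as flag | allowed-algorithm bitmap | 80 random bits."""
    if not 0 <= allowed_mask < (1 << ALG_BITS):
        raise ValueError("bitmap must fit in 16 bits")
    random_part = secrets.randbits(FREE_BITS)
    return ((SPECIAL_FLAG << (ALG_BITS + FREE_BITS))
            | (allowed_mask << FREE_BITS)
            | random_part)


def parse_rand(rand: int):
    """Return the bitmap if `rand` carries the flag, otherwise None."""
    if rand >> (ALG_BITS + FREE_BITS) != SPECIAL_FLAG:
        return None
    return (rand >> FREE_BITS) & ((1 << ALG_BITS) - 1)


def is_allowed(rand: int, alg_index: int) -> bool:
    """Check whether algorithm number `alg_index` (0..15) may use the key."""
    mask = parse_rand(rand)
    if mask is None:
        return True          # ordinary RAND: legacy behaviour, no restriction
    return bool((mask >> alg_index) & 1)


# Assumption: bit 2 stands for A5/3 and bit 1 for A5/2.
rand = make_special_rand(1 << 2)
```

With this layout a terminal that receives the replayed RAND of the attack
described above would refuse to use the derived Kc with A5/2.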

7 Continuity 

In this section we study issues that are related to an important practical
requirement: continuity. We have to make sure that new systems are backward
compatible with legacy devices and equipment. There is a two-fold effect. On
the one hand, handling of legacy easily introduces security holes into the new
system. On the other hand, new systems are also the legacy of the future, and
therefore potential future requirements should be taken into account in the
design as well. Of course, finding the right balance between this kind of
future-proofing and current needs is a tricky task.

7.1 3G-WLAN Interworking with EAP-SIM

The abbreviation EAP-SIM refers to an Internet draft that describes how the GSM
authentication and key agreement protocol can be run in EAP, see [3]. In
addition, this mechanism enhances the GSM authentication and key agreement
protocol with mutual entity authentication based on the derived key Kc. This is
done by utilizing a bundle of (at least two) GSM triplets (RAND, SRES, Kc) in
one run of the entity authentication. Therefore, the network authentication is
based on a secret of (at least) 128 bits.

WLAN interworking in 3GPP follows the basic idea of connecting a WLAN
access zone to the cellular core network. There are several levels of
interworking. For instance, we may have a shared subscriber database, shared
charging and authentication, or even shared services.

Cellular access (also for the 3GPP radio network) is possible with either SIM
or USIM, and therefore the same should hold for WLAN access as well. This
creates the following problem: enhancements (e.g. mutual authentication) in
EAP-SIM break down if an active GSM attack is possible against the terminal on
the cellular side. In particular, an attacker may mount a divide-and-conquer
attack against the bundle of triplets by breaking each Kc separately.
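
The effect of divide-and-conquer can be quantified with a small calculation.
Exhaustive search is used here only as a yardstick, and the nominal 64 bits per
Kc is an assumption inferred from the figure of 128 bits for two triplets; the
function names are ours.

```python
import math


def joint_search_log2(num_triplets: int, key_bits: int = 64) -> float:
    """log2 of the work to search the whole bundle as one secret."""
    return float(num_triplets * key_bits)


def separate_search_log2(num_triplets: int, key_bits: int = 64) -> float:
    """log2 of the work when each Kc can be attacked on its own.

    The total is num_triplets * 2**key_bits, so the efforts add up
    instead of multiplying.
    """
    return key_bits + math.log2(num_triplets)


joint = joint_search_log2(2)        # 128.0
separate = separate_search_log2(2)  # 65.0
```

Bundling triplets therefore helps only as long as the individual keys cannot
be attacked one at a time.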

The problem can easily be avoided if the same physical SIM cannot be used 
for both cellular and WLAN domains but then we lose part of the interworking 
benefits. 



7.2 Phased Introduction of Security 

This example case of a legacy issue is related to the introduction of security 
gateways in network-to-network communications, see earlier section on network 
domain security. The problem is, however, not restricted to this context. The 
starting point is that communication between networks works well without this 
additional security measure. 

The first problem can be illustrated by the following simplified calculation. 
Assume 10 % of networks have been upgraded to support security gateways. 
But then only about 1 % of the total communication volume is protected. This 
problem actually applies to any added feature, not only to security features. 

The second problem is, instead, specific to security. Assume now that 99 %
of networks have been upgraded to support security gateways. Then about 98 %
of the total communication volume is protected. But an active attacker will
certainly masquerade as one of the remaining 1 % of networks.
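
Both figures follow from the same simplified model, sketched below. The model
is our reading of the calculation: traffic is assumed to be spread evenly over
pairs of networks, and a connection is protected only when both ends have been
upgraded. The function name is hypothetical.

```python
def protected_fraction(upgraded_share: float) -> float:
    """Fraction of network-to-network traffic that is protected.

    Assumes uniform traffic between networks and that protection needs
    a security gateway at both ends of a connection.
    """
    return upgraded_share ** 2


early = protected_fraction(0.10)   # about 0.01, i.e. 1 %
late = protected_fraction(0.99)    # about 0.98, i.e. 98 %
```

The quadratic dependence explains the slow start of any feature that needs
support at both ends. The remaining unprotected share is what an active
attacker targets.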



7.3 Bluetooth Initialization 

Our last example of continuity issues deals with future-proofing. 

The original motivation of Bluetooth radio technology was to “replace wires”.
It makes sense to assume that the initial introduction between two devices owned
by the same person (e.g. a mobile phone and a headset) occurs in a relatively
secure environment (e.g. at home or at the office). In such an environment, an
active man-in-the-middle attack is not very probable.

Later many use cases were invented for Bluetooth and suddenly there was 
a new requirement: to establish secure connections with foreign devices as well, 
e.g. your mobile phone and a Bluetooth device in a ticket booth. It is not rea- 
sonable anymore to assume there are no active men-in-the-middle in this kind 
of environment. 

The lesson to be learnt from this case is that it is always good to leave some
safety margin. This is true not only in the quantitative dimension (e.g. key
lengths) but also qualitatively, i.e. in the plurality of security
functionalities.



8 Key Management 

It is a well-known fact in practical cryptography that management of keys is 
usually a tricky issue. 



8.1 Change of Keys: IMS First Hop Protection 

In SIP registration there is a possibility for entity authentication (in 3GPP 
system). At the same time new keys are derived (both at terminal side and at 
network). For optimal use of resources, we would like to minimize the number 
of simultaneously valid keys per user. Clearly we have to allow two keys during 
the change process: the current key (set) and the new key (set). 

Now we have a problem. A registration message from the user triggers the
creation of new keys on the network side. What if an attacker sends a fake
registration message while the change of keys is ongoing? Should we ignore
this message? Both answers have unpleasant consequences: ignoring is bad if
the message is not fake after all but, on the other hand, accepting the message
necessarily increases the number of concurrent keys to be stored.



8.2 PKI Issues 

Public Key Infrastructure is an area where many issues are well documented in
the literature, see e.g. [6] (see also [7] for fundamentals of public key
technology). In this paper we just list a few interesting issues with PKI and
certificates: How to deliver new root certificates into terminals that are
already in the field? How to introduce client certificates into legacy systems?
Can we utilize existing authentication and authorization infrastructure? How to
define certificates applicable to future services?



8.3 Digital Rights Management Issues 

This is another area that has been under extensive study during recent years. A
few points of specific interest in DRM are listed in the following:

- User may act as an attacker against his own device.

- Use of global secrets implies an unwanted “break one, break all” phenomenon.

- There are difficult backward compatibility issues in DRM. 

- With download applications it is OK to get the key (and rights) to the 
content afterwards but with streaming applications the key is necessarily needed 
in advance. 

9 Constraints from Outside 

We briefly mention a couple of problems in this area. End-to-end protection is
problematic for at least the following reasons: addressing and routing,
middle proxies, lawful interception, and key management.

Meeting requirements that are introduced late is always difficult (e.g.
Bluetooth initialization), but it is still good that security is taken into
account already in the early phases of system design; hence an iterative
process should be used.

10 Conclusions 

Some general conclusions can be crystallized: 

- Cryptography is a major tool in making wireless systems secure; 

- It is nontrivial to apply general-purpose security tools in a new context; 

- We cannot ignore massive legacy systems; 

- Incremental security enhancements lead to complex solutions; 

- Reasonable safety margins can be justified also in deciding which security 
features to implement. 



References 

1. Asokan, N., Niemi, V. and Nyberg, K., Man-in-the-middle in tunneled
authentication protocols, Proceedings of 11th Cambridge Workshop on Security
Protocols, Springer Lecture Notes in Computer Science (to appear).

2. Barkan, E., Biham, E. and Keller, N., Instant Ciphertext-Only Cryptanalysis of
GSM Encrypted Communication, Proceedings of CRYPTO 2003, Springer Lecture
Notes in Computer Science (2003).

3. draft-haverinen-pppext-eap-sim-12, October 2003: “EAP SIM Authentication”
(work in progress).

4. Blunk, L. and Vollbrecht, J., PPP Extensible Authentication Protocol (EAP),
Internet Engineering Task Force, Request for Comments (RFC) 2284.

5. Niemi, V. and Nyberg, K., UMTS Security, John Wiley & Sons (2003).

6. Gutmann, P., PKI: It’s Not Dead, Just Resting, IEEE Computer 35(8), pp. 41-49,
August 2002.

7. Salomaa, A., Public-Key Cryptography, Second Edition, Springer (1996).




On a Tomographic Equivalence Between 
(0,1)-Matrices* 



Maurice Nivat 

Université Denis Diderot-Case 7014, 2, place Jussieu, F-75251 Paris Cedex 05

mnivat@wanadoo.fr



Abstract. The tomographic problems studied here are associated with
reconstructing a matrix when only some local information is given. We
investigate a problem of discrete tomography via R-null matrices and
prove a result similar to Ryser’s Theorem.



1 Introduction 

A very important basic fact occurring often in the present paper is as follows: 
every tiling of the plane by translations of a given m x n rectangle is invariant 
by one translation. This invariant translation is either the horizontal translation 
of length m or the vertical translation of length n (or both in the particular case 
of a regular tiling). 

Moreover, the following is true: assume that one point of the tiling rectangle
is marked in some way and each tile in the tiling contains a marked point in
the same position. If we then look at the tiled plane through a rectangular
m x n window, exactly one marked point is seen regardless of the position of the
window, see Figure 1.

This last fact is not characteristic of rectangles, but is valid also for all pieces 
P with which one can tile the plane by translation as shown in Figure 1. In fact, 
a theorem can be stated as: 

Theorem 1. Let $U$ be a mapping from $\mathbb{Z}^2$ to $\{0, 1\}$ and $P$ be a
mapping from a finite subset $F$ of $\mathbb{Z}^2$ to $\{0, 1\}$ such that
$\mathrm{card}\{f \in F \mid P(f) = 1\} = 1$. Then the two following assertions
are equivalent:

(1) $\forall z \in \mathbb{Z}^2,\ \mathrm{card}\{f \in F \mid U(z + f) = 1\} = 1$

(2) $\mathbb{Z}^2 = U^{-1}(1) \oplus F'$,

where the symbol $\oplus$ denotes the unambiguous Minkowski sum: $C = A \oplus B$
if and only if:

$\forall c \in C\ \exists a \in A,\ b \in B:\ c = a + b,$

$\forall a_1, a_2 \in A;\ b_1, b_2 \in B:\ a_1 + b_1 = a_2 + b_2 \Rightarrow a_1 = a_2$ and $b_1 = b_2$.

* Dedicated to Arto Salomaa for his 70th birthday.

Fig. 1. Part of a tiling which is invariant by a vertical translation, showing two positions
of the window in which exactly one 1 appears



Property (1) says that $U$ contains exactly one 1 in each position of the
window $F$, and property (2) says that $\mathbb{Z}^2$ is tiled by translations
of $F$; more precisely, if we surround each 1 in $U$ by a copy of $F$ such that
the symbol 1 is always in the same position in $F$, we obtain a tiling of
$\mathbb{Z}^2$. The notions above have been generalized in [5] to obtain the
notion of a homogeneous bidimensional sequence.

Definition 1. A mapping $U : \mathbb{Z}^2 \to \{0, 1\}$ is homogeneous of degree
$k$ with respect to a finite window $F$ if and only if

$\forall z \in \mathbb{Z}^2,\ \mathrm{card}\{f \in F \mid U(z + f) = 1\} = k.$
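
Definition 1 can be tested numerically on a finite sample of window positions.
The sketch below is only an illustration: the two periodic sequences, the
4 x 3 rectangle and the function names are our own choices, and checking
finitely many positions is of course no proof for all of $\mathbb{Z}^2$.

```python
from itertools import product


def window_count(U, window, z):
    """Number of 1's of U seen through the window placed at position z."""
    return sum(U(z[0] + f[0], z[1] + f[1]) for f in window)


def is_homogeneous_on(U, window, k, xs, ys):
    """Check the condition of Definition 1 for all z in xs x ys."""
    return all(window_count(U, window, (x, y)) == k
               for x, y in product(xs, ys))


m, n = 4, 3
R = [(i, j) for i in range(m) for j in range(n)]   # rectangular window


def U1(x, y):
    """One marked point per tile of a regular tiling by the rectangle."""
    return 1 if x % m == 0 and y % n == 0 else 0


def U2(x, y):
    """A second marked point at another position of the tile."""
    return 1 if x % m == 1 and y % n == 2 else 0


def U(x, y):
    """Sum of the two disjoint sequences above."""
    return U1(x, y) + U2(x, y)


sample = range(-6, 7)
degree1 = is_homogeneous_on(U1, R, 1, sample, sample)
degree2 = is_homogeneous_on(U, R, 2, sample, sample)
```

Here U is a sum of two disjoint sequences of degree 1, so it is homogeneous of
degree 2 on the sampled positions.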

In [5] we proved a rather surprising result. 

Theorem 2. A mapping $U : \mathbb{Z}^2 \to \{0, 1\}$ is homogeneous of degree $k$
with respect to a rectangle $R$ if and only if there exist $k$ disjoint
homogeneous sequences of degree 1 (with respect to the same $R$) such that:

$U = U_1 + U_2 + \cdots + U_k$

This last result can be nicely rephrased: 

If $U : \mathbb{Z}^2 \to \{0, 1\}$ is homogeneous of degree $k$ with respect to
$F$, then one can color the 1's with $k$ colors in such a way that in each
position of the window there appears one and only one 1 of each color.

Example 1. In Figure 2, the three sequences corresponding to a, c and d are
invariant by the translation (4, 0) and the fourth one corresponding to b is
invariant by the translation (0, 3).

Fig. 2.



We suspect that Theorem 2 is valid for all exact windows $F$. By an exact
window we mean a window $F$ such that one can tile the plane by translations
of $F$. In any case, there exists a sequence $U$ which is homogeneous of degree
1 with respect to $F$ if and only if $F$ is exact (by Theorem 1), and thus
Theorem 2 can hold only for exact windows. For the time being we are unable to
prove Theorem 2 in this more general case.

2 A Decomposition Theorem for Homogeneous Matrices 
with Integer Coefficients 

We shall deal with matrices rather than sequences. A matrix $M$ of size
$p \times q$ is a mapping from $[p] \times [q]$ into a set of coefficients,
which will be either $\{0, 1\}$ or $\{-1, 0, 1\}$ or $\mathbb{N}$ or
$\mathbb{Z}$. We use the notation $[p]$ to denote the set
$[p] = \{0, \ldots, p - 1\}$. The definition of homogeneity is now slightly
changed.

Definition 2. Let $R$ be the rectangle $[m] \times [n]$. The matrix
$M : [p] \times [q] \to \mathbb{Z}$ is homogeneous of degree $k$ with respect to
$R$ if and only if:

$\forall (x, y) \in [p - m + 1] \times [q - n + 1],\quad \sum\{M(x + i, y + j) \mid (i, j) \in [m] \times [n]\} = k.$

Note that if the set of coefficients is $\{0, 1\}$, this definition coincides
with the previous Definition 1.

The matrix $R(M)$ of size $(p - m + 1) \times (q - n + 1)$ with coefficients
$R(M)(x, y) = \sum\{M(x + i, y + j) \mid (i, j) \in [m] \times [n]\}$ is called
the R-projection of a matrix $M$ with coefficients in $\mathbb{Z}$.

A matrix is homogeneous with respect to $R$ if and only if its R-projection is
constant. We call a matrix R-null if and only if its R-projection is the matrix
0, whose coefficients are all 0.
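
The R-projection is easy to compute directly from its definition. The sketch
below is a minimal illustration with matrices stored as lists of lists indexed
M[x][y]; the function names and the small example matrices are our own.

```python
def r_projection(M, m, n):
    """R-projection of M for R = [m] x [n]: all m x n window sums."""
    p, q = len(M), len(M[0])
    return [[sum(M[x + i][y + j] for i in range(m) for j in range(n))
             for y in range(q - n + 1)]
            for x in range(p - m + 1)]


def homogeneous_degree(M, m, n):
    """Return k if M is homogeneous of degree k, otherwise None."""
    values = {v for row in r_projection(M, m, n) for v in row}
    return values.pop() if len(values) == 1 else None


def is_r_null(M, m, n):
    """True when every coefficient of the R-projection is 0."""
    return homogeneous_degree(M, m, n) == 0


m = n = 2
p = q = 4
A = [[1 if x % 2 == 0 and y % 2 == 0 else 0 for y in range(q)] for x in range(p)]
B = [[1 if x % 2 == 1 and y % 2 == 0 else 0 for y in range(q)] for x in range(p)]
D = [[A[x][y] - B[x][y] for y in range(q)] for x in range(p)]

deg_A = homogeneous_degree(A, m, n)   # A and B are homogeneous of degree 1
null_D = is_r_null(D, m, n)           # their difference is R-null
```

The example also illustrates why R-null matrices describe the ambiguity of the
projection: A and B are different matrices with the same R-projection.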

Theorem 3. A matrix $M$ with coefficients in $\mathbb{N}$, the set of
non-negative integers, is homogeneous of degree $k$ with respect to $R$ if and
only if it is the sum of $k$ matrices $M_1, M_2, \ldots, M_k$ with coefficients
in $\{0, 1\}$ which are homogeneous of degree 1.

The proof is very similar to the proof of Theorem 2 given in [5] but slightly
more difficult.

Let $M$ and $M'$ be two matrices of the same size $p \times q$ with coefficients
in $\mathbb{N}$. We say that $M'$ is smaller than $M$ if and only if for all
$(i, j) \in [p] \times [q]$:

$M'(i, j) \le M(i, j).$

In order to prove Theorem 3 we show that if $M$ is homogeneous of degree $k$
with respect to $R$, there exists a matrix $M'$ with coefficients in $\{0, 1\}$
which is homogeneous of degree 1 with respect to $R$ and smaller than $M$. Then
we can subtract $M'$ from $M$ to obtain $M - M'$, whose coefficients are in
$\mathbb{N}$, and obviously $M - M'$ is homogeneous of degree $k - 1$.

Clearly we can repeat this process and eventually write M as a sum of (0, 1)- 
matrices which are homogeneous of degree 1. 

Now we need a crucial lemma. 

Lemma 1. If $M$ with coefficients in $\mathbb{Z}$ is homogeneous with respect to
$R$, then for all $(x, y) \in [p] \times [q]$ satisfying
$(x + m, y + n) \in [p] \times [q]$ one has

$M(x, y) + M(x + m, y + n) = M(x + m, y) + M(x, y + n).$

Proof. Let $s = \sum\{M(x + i, y) \mid 1 \le i \le m - 1\}$ and
$s' = \sum\{M(x + i, y + n) \mid 1 \le i \le m - 1\}$. We clearly have

$M(x, y) + s = M(x, y + n) + s'$

and

$s + M(x + m, y) = s' + M(x + m, y + n),$

which imply the equality

$s - s' = M(x, y + n) - M(x, y) = M(x + m, y + n) - M(x + m, y).$

□
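
The identity of Lemma 1 can be checked on a concrete homogeneous matrix. The
matrix below is a hypothetical example of ours: it is built from one pattern
that is invariant by the translation (m, 0) and one that is invariant by
(0, n), with m = 2 and n = 3.

```python
m, n = 2, 3
p, q = 8, 9


def H(x, y):
    """Degree-1 pattern invariant by the horizontal translation (m, 0)."""
    return 1 if y % n == 0 and x % m == (y // n) % m else 0


def V(x, y):
    """Degree-1 pattern invariant by the vertical translation (0, n)."""
    return 1 if x % m == 1 and y % n == (x // m) % n else 0


# Homogeneous matrix of degree 2 * 1 + 1 = 3 with coefficients in N.
M = [[2 * H(x, y) + V(x, y) for y in range(q)] for x in range(p)]


def window_sum(M, x, y):
    """Sum of the coefficients in the m x n window placed at (x, y)."""
    return sum(M[x + i][y + j] for i in range(m) for j in range(n))


window_sums = {window_sum(M, x, y)
               for x in range(p - m + 1) for y in range(q - n + 1)}


def lemma1_holds(M):
    """Check M(x,y) + M(x+m,y+n) == M(x+m,y) + M(x,y+n) wherever defined."""
    return all(M[x][y] + M[x + m][y + n] == M[x + m][y] + M[x][y + n]
               for x in range(p - m) for y in range(q - n))


holds = lemma1_holds(M)
```

Changing a single coefficient of M destroys homogeneity and, in this example,
also the identity at the affected corner.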

Fig. 3. [Diagram; surviving labels: (x+m, y+n) and (x+m, y).]



Figure 3 helps in visualizing the property of Lemma 1.

We just expressed the fact that the sum of the coefficients in the rectangle
whose lower left corner is (x, y) is equal to the sum of the coefficients in the
rectangle whose upper left corner is (x, y + n).

In a similar way, we expressed the equality of the sums of the coefficients in
the two rectangles whose lower right corner is (x + m, y) and whose upper right
corner is (x + m, y + n), respectively.

Lemma 1 can be easily extended to get: 

Lemma 2. For all $(x, y) \in [p] \times [q]$ and all
$\alpha, \beta \in \mathbb{Z}$ the following holds: if $M$ satisfies the
conditions of Lemma 1 and $(x + \alpha m, y + \beta n)$ belongs to
$[p] \times [q]$, one has

$M(x, y) + M(x + \alpha m, y + \beta n) = M(x, y + \beta n) + M(x + \alpha m, y).$

Proof. The proof is immediate by symmetry and induction. □

Proof (The proof of Theorem 3). Assume first that $M$ is invariant by the
translation $(m, 0)$.

Due to the invariance, for all $(x, y) \in [p] \times [q]$ and
$\alpha \in \mathbb{Z}$ we have that if $x + \alpha m \in [p]$, then
$M(x + \alpha m, y) = M(x, y)$. Then one can easily find a (0, 1)-matrix which
is homogeneous of degree 1 with respect to $R$ and smaller than $M$ in the
following way:

Take any non-zero coefficient $M(x, y)$, $(x, y) \in [m] \times [n]$, and set
$x_0 = x$. Now for all $\beta$ such that $y + \beta n \in [q]$ there exists a
strictly positive coefficient $M(x_\beta, y + \beta n)$ with
$x_\beta \in [m]$. This is obvious since

$\sum\{M(x', y + \beta n) \mid x' \in [m]\} = \sum\{M(x', y) \mid x' \in [m]\}$

and $\sum\{M(x', y) \mid x' \in [m]\} \ge M(x, y) > 0$.

Now for each $\alpha, \beta$ such that
$(x_\beta + \alpha m, y + \beta n) \in [p] \times [q]$, all the coefficients
$M(x_\beta + \alpha m, y + \beta n)$ are strictly positive, and the (0, 1)-matrix
$M'$ given by

$M'(u, v) = \begin{cases} 1, & \text{if } (u, v) = (x_\beta + \alpha m, y + \beta n), \\ 0, & \text{otherwise} \end{cases}$

is smaller than $M$ and can be subtracted from $M$. The matrix $M - M'$ is
homogeneous of degree $k - 1$ and is also invariant by the translation
$(m, 0)$. As mentioned at the beginning of the proof, it follows that a
homogeneous matrix having coefficients in $\mathbb{N}$ which is invariant by
$(m, 0)$ can be expressed as a sum of (0, 1)-matrices which are homogeneous of
degree 1 and invariant by $(m, 0)$.

Assume now M is not invariant by the translation (m, 0). Then there exists 
(x,y) such that M{x,y) and M{x + m,y) are different. We can assume that 
M{x,y) > M{x + m,y) (the argument in case M{x,y) < M{x + m,y) is exactly 
the same). 

We can show that for all $\beta$ such that $y + \beta n \in [q]$,
$M(x, y + \beta n)$ is strictly positive. This is an immediate consequence of
Lemma 2, since

$M(x, y + \beta n) + M(x + m, y) = M(x, y) + M(x + m, y + \beta n)$

implies that

$M(x, y + \beta n) - M(x + m, y + \beta n) = M(x, y) - M(x + m, y)$

is strictly positive.

Fig. 4.



Let us set $y_0 = y$. Consider any column $x + \alpha m$,
$x + \alpha m \in [p]$, and compare the sum

$\sum\{M(x, y - h + j) \mid j \in [n]\}$

of $n$ consecutive coefficients in the column $x$ containing $M(x, y)$, for some
$h < n$, with the sum

$\sum\{M(x + \alpha m, y - h + j) \mid j \in [n]\}.$

The sums are equal. Then two cases are possible:

- $\forall j \in [n],\ M(x + \alpha m, y - h + j) = M(x, y - h + j)$. This
implies, by Lemma 2, that $M(x + \alpha m, j) = M(x, j)$ for all $j \in [q]$.





Fig. 6. 



We can then take $y_\alpha = y$ and be sure that

$M(x + \alpha m, y_\alpha + \beta n)$

is strictly positive for all $\beta$ such that $y_\alpha + \beta n \in [q]$.

- There exists $j$ such that $M(x + \alpha m, y - h + j) > M(x, y - h + j)$.
Then if we set $y_\alpha = y - h + j$ we are sure, by an argument used above,
that $M(x + \alpha m, y_\alpha + \beta n)$ is strictly positive for all $\beta$
such that $y_\alpha + \beta n \in [q]$.

Now the matrix $M'$ defined as

$M'(u, v) = \begin{cases} 1, & \text{if } (u, v) = (x + \alpha m, y_\alpha + \beta n), \\ 0, & \text{otherwise} \end{cases}$

is a (0, 1)-matrix which is homogeneous of degree 1, invariant by the translation
$(0, n)$ and smaller than $M$.

This completes the proof. □ 

Example 2. Let $m = n = 3$. The matrix in Figure 4 is homogeneous of degree
5. The circled element can only belong to a homogeneous matrix of degree 1
invariant by the horizontal translation. We easily find one (we have the choice
between 2) and subtract it from $M$ to obtain the matrix in Figure 5. The
matrix in Figure 5 is homogeneous of degree 4.

The circled element can only belong to a “vertical” homogeneous submatrix.
We find one and delete it to obtain the matrix of Figure 6. It is homogeneous of
degree 3. This last matrix can be decomposed in only one way, as a sum of two
“horizontal” matrices and one “vertical” matrix, each homogeneous of degree 1.

We can now easily prove a decomposition theorem for homogeneous matrices with
coefficients in $\mathbb{Z}$ and for R-null matrices.

Theorem 4. A matrix with coefficients in $\mathbb{Z}$ is homogeneous of degree
$k$ with respect to $R$ if and only if it is a difference of two sums of
(0, 1)-matrices which are homogeneous of degree 1 with respect to $R$.

The number of elements in these two sums can be bounded by:

$k + \sum\{M(x, y) \mid M(x, y) < 0,\ (x, y) \in [p] \times [q]\}$

and

$k + \sum\{M(x, y) \mid M(x, y) > 0,\ (x, y) \in [p] \times [q]\}$

Proof. Let $M$ be a homogeneous matrix with coefficients in $\mathbb{Z}$.
Consider a negative coefficient $M(x, y) = -a$, $a > 0$.

Take any (0, 1)-matrix $M'$ which is homogeneous of degree 1 and satisfies
$M'(x, y) = 1$. Then $M + aM'$ is a matrix with coefficients in $\mathbb{Z}$
which is homogeneous of degree $k + a$ and satisfies:

$(M + aM')(i, j) \ge M(i, j)$ for all $(i, j) \in [p] \times [q]$.

Moreover, $M + aM'$ has at least one negative coefficient fewer than $M$.

Repeating this process until all the negative coefficients disappear, we can
write $M$ as the difference $M_1 - M_2$, where $M_1$ and $M_2$ have non-negative
coefficients and are homogeneous, $M_1$ being of degree
$k + \sum\{-M(x, y) \mid M(x, y) < 0,\ (x, y) \in [p] \times [q]\}$.

It may be more economical to have all the positive coefficients disappear if
the sum of the positive coefficients is less than the absolute value of the sum
of the negative coefficients.

When $M$ has been written as the difference $M_1 - M_2$, we obtain the theorem by
decomposing $M_1$ and $M_2$ into sums of homogeneous (0, 1)-matrices of degree 1.

□



3 R-null Matrices

Clearly, if M and M' have the same R-projection then M - M' is R-null.

Studying R-null matrices is a natural way to study the equivalence between
matrices defined by the equality of their R-projections.

We can see the problem of constructing a (0, 1)-matrix with a given
R-projection as a problem of discrete tomography: we are given a family of local
pieces of information on a set of pixels distributed in a rectangle and the problem
is to retrieve this information.

The first problem of discrete tomography appearing in the literature is the
problem of constructing a (0, 1)-matrix with given row sums and column sums.
Solutions to this problem were given by Ryser and, independently, by Gale.

Let $r_0, \ldots, r_{p-1}$ and $c_0, \ldots, c_{q-1}$ be two sequences of
non-negative integers such that

$\sum\{r_i \mid i \in [p]\} = \sum\{c_j \mid j \in [q]\}.$

Can one find a (0, 1)-matrix M of size $p \times q$ such that

$\forall i \in [p]$ we have $\sum\{M(i, j) \mid j \in [q]\} = r_i$

and

$\forall j \in [q]$ we have $\sum\{M(i, j) \mid i \in [p]\} = c_j$?

The $r_i$'s are called row sums (in more recent literature the vector
$(r_0, \ldots, r_{p-1})$ is called the horizontal projection) and the $c_j$'s are
called column sums (the vector $(c_0, \ldots, c_{q-1})$ is the vertical
projection). What interests us here is the study of the equivalence defined by:

M is equivalent to M' if and only if M and M' have the same horizontal and
vertical projections (Ryser).
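
The existence question stated above can be explored with a short program. The following is only a sketch, assuming a greedy strategy in the spirit of the Gale and Ryser constructions (not necessarily the exact procedure of either author): each row places its 1's in the columns that still need the most 1's. The function names `build_binary_matrix` and `projections` are hypothetical, introduced here for illustration.

```python
def build_binary_matrix(row_sums, col_sums):
    """Return a 0/1 matrix (list of rows) with the given projections, or None."""
    if sum(row_sums) != sum(col_sums):
        return None
    q = len(col_sums)
    remaining = list(col_sums)  # number of 1's each column still needs
    matrix = []
    for r in row_sums:
        # Columns that still need a 1, largest remaining demand first.
        candidates = sorted(
            (j for j in range(q) if remaining[j] > 0),
            key=lambda j: -remaining[j],
        )
        if r > len(candidates):
            return None  # not enough columns can still accept a 1
        row = [0] * q
        for j in candidates[:r]:
            row[j] = 1
            remaining[j] -= 1
        matrix.append(row)
    return matrix if all(c == 0 for c in remaining) else None


def projections(matrix):
    """Horizontal (row sums) and vertical (column sums) projections."""
    rows = [sum(row) for row in matrix]
    cols = [sum(col) for col in zip(*matrix)]
    return rows, cols


M = build_binary_matrix([2, 1, 1], [2, 1, 1])
print(M, projections(M))
```

When no (0, 1)-matrix has the requested projections, the sketch returns `None` instead of a matrix.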

An elementary Ryser transformation amounts to exchanging, in a matrix, two
1's in positions (x, y) and (x + h, y + l) with two 0's in positions (x + h, y)
and (x, y + l), see Figure 7.
Fig. 7.



Clearly such a transformation leaves the two projections invariant. The nice 
result of Ryser is that if M and M' are equivalent then one can transform M 
into M' by a sequence of elementary transformations. 
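
The invariance of the two projections under an elementary transformation can be checked mechanically. The sketch below assumes a matrix stored as a list of rows, with the first index playing the role of x and the second that of y; the helper names `ryser_interchange` and `projections` are hypothetical.

```python
def projections(matrix):
    """Row sums and column sums of a matrix given as a list of rows."""
    return [sum(r) for r in matrix], [sum(c) for c in zip(*matrix)]


def ryser_interchange(matrix, x, y, h, l):
    """Swap the 1's at (x, y), (x+h, y+l) with the 0's at (x+h, y), (x, y+l).

    Returns a new matrix; raises ValueError if the required pattern is absent.
    """
    m = [row[:] for row in matrix]
    pattern_ok = (
        m[x][y] == 1 and m[x + h][y + l] == 1
        and m[x + h][y] == 0 and m[x][y + l] == 0
    )
    if not pattern_ok:
        raise ValueError("the two 1's and two 0's are not in the expected positions")
    m[x][y] = m[x + h][y + l] = 0
    m[x + h][y] = m[x][y + l] = 1
    return m


before = [[1, 0, 1],
          [0, 1, 0]]
after = ryser_interchange(before, 0, 0, 1, 1)
print(after, projections(before) == projections(after))
```

Each affected row and each affected column loses one 1 and gains one 1, which is why the comparison of projections holds.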

The matrix in Figure 8 is obtained by performing two sequences of elementary
Ryser transformations.

Introducing matrices with coefficients in {-1, 0, 1} we can state Ryser's result
as follows.

Theorem 5 (Ryser's theorem). Every matrix with coefficients in {-1, 0, 1}
whose horizontal and vertical projections are the constant vector equal to 0 is a
sum of matrices of the form:

$M'(x, y) = 0$ but for $(x, y) \in \{(x_0, y_0), (x_1, y_1), (x_0, y_1), (x_1, y_0)\}$ for some
$x_0, x_1, y_0, y_1$ such that $x_0 \ne x_1$ and $y_0 \ne y_1$, for which one has

$M'(x_0, y_0) = M'(x_1, y_1) = 1$

and

$M'(x_0, y_1) = M'(x_1, y_0) = -1.$

Here we can prove a very similar theorem. Let us say that a row $\{M(x, y) \mid
y \in [q]\}$ of matrix M is m-null if and only if the sum of m consecutive entries of
that row is always 0. We define n-null columns in the same way.

Note that an m-null row is invariant by the translation (m, 0). Obviously

$a_1 + a_2 + \cdots + a_m = a_2 + a_3 + \cdots + a_m + a_{m+1}$ implies $a_1 = a_{m+1}$.

Adding an m-null row or an n-null column to a given matrix M obviously
does not change the R-projection. Moreover, if M and M' have the same
R-projections then one can obtain M' from M by adding to M a number of m-null
rows and n-null columns.
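
The two observations above (periodicity of an m-null row, and invariance of the R-projection when such a row is added) can be illustrated numerically. This is a sketch under an assumption: the R-projection, whose definition lies outside this section, is modelled here as the list of sums over all windows of m consecutive columns and n consecutive rows. The helper names are hypothetical.

```python
def window_sums(matrix, m, n):
    """Sums over every window of n consecutive rows and m consecutive columns."""
    p, q = len(matrix), len(matrix[0])
    return [
        [sum(matrix[i + a][j + b] for a in range(n) for b in range(m))
         for j in range(q - m + 1)]
        for i in range(p - n + 1)
    ]


def is_m_null(seq, m):
    """True if every run of m consecutive entries of seq sums to 0."""
    return all(sum(seq[j:j + m]) == 0 for j in range(len(seq) - m + 1))


def add_row(matrix, index, row):
    """Return a copy of matrix with `row` added entrywise to row `index`."""
    out = [r[:] for r in matrix]
    out[index] = [u + v for u, v in zip(out[index], row)]
    return out


M = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
row = [1, -1, 1, -1]                     # a 2-null row, hence of period 2
print(is_m_null(row, 2))
print(window_sums(add_row(M, 1, row), 2, 2) == window_sums(M, 2, 2))
```

Every window that meets the modified row contains exactly m consecutive entries of it, so its sum changes by 0.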

We can now state the following theorem. 

Theorem 6. The set of all matrices all of whose entries are 0 except for those of
an m-null row or an n-null column is a generating subset of the vector space of
R-null matrices.

Remark 1. Note that the set of m-null rows and n-null columns is not a basis of
the vector space, for they are not linearly independent, as proved by the example
in Figure 9.



Proof. We first give a very easy one.
Fig. 9.



In the decomposition of an R-null matrix into a sum of homogeneous (0, 1)- or
(0, -1)-matrices of degree 1 or -1, it is clear that the number of (0, 1)-matrices
will be the same as the number of (0, -1)-matrices. Thus M is a sum

$M = M_1 + \cdots + M_k$

of matrices of the form H - H', where both H and H' are (0, 1)-matrices
homogeneous of degree 1.

We then prove that each matrix H - H' is a sum of m-null rows and n-null
columns. Consider first the case where H and H' are both invariant by the
translation (m, 0). The figures illustrate the proof.
Fig. 10.

We can make all the (nonzero) rows start with a 1 or a -1 in the first column
by adding m-null rows. To the matrix of Figure 10 we add the set of m-null rows
shown in Figure 11 to obtain the sum in Figure 12, which is obviously composed
of n-null columns.

If the rows containing the 1's and -1's are the same, then obviously all the
rows are m-null.

Now consider H - H' where H is invariant by (m, 0) and H' is invariant by
(0, n), see Figure 13. Note that the two 0's which appear come from a 1 and a
-1 occurring in the same position.




Fig. 11. 



By adding m-null rows we can have all the rows containing 1's start in the
first column (Figure 14). As a result, we get Figure 15.

Now it suffices to have all the columns containing -1's contain -1's in the
rows containing 1's; we can achieve that by adding n-null columns to obtain a
matrix which has only m-null rows. □

Since addition is commutative and a matrix whose rows (resp. columns) are
all m-null (resp. n-null) is invariant by the translation (m, 0) (resp. (0, n)), we
can state:

Corollary 1. Every R-null matrix M is the sum of $M_1$ and $M_2$, where $M_1$ is
$(m \times 1)$-null and $M_2$ is $(1 \times n)$-null.

4 An Alternate Proof of Theorem 6 

Consider an $m \times n$-null matrix M of size $p \times q$. We can add to M a matrix $M_1$
with m-null rows in order that $M + M_1$ has only 0's in its leftmost column.
Fig. 14.




Fig. 15. 



The first column of $M + M_1$ is full of 0's, and this implies that all the columns
of rank $\beta m$, $\beta \in \mathbb{N}$, of $M + M_1$ are n-null columns. Since $(M + M_1)(i, 0) = 0$
for all i,

$\sum\{(M + M_1)(i + h, j + k) \mid h \in [n], 1 \le k \le m - 1\} = 0$

and, since $\sum\{(M + M_1)(i + h, j + 1 + k) \mid h \in [n], k \in [m]\} = 0$, we have for all i

$\sum\{(M + M_1)(i + h, m) \mid h \in [n]\} = 0.$

The sum of n consecutive coefficients in the column of rank m is equal to 0, and
obviously this is also true for all the columns of rank $\beta m$, $\beta \in \mathbb{N}$.

Let $M'_1$ be the matrix whose columns are full of 0's but for the columns of
rank $\beta m$, each of which is the opposite of the n-null column of rank $\beta m$ of
$M + M_1$. Then in $M + M_1 + M'_1$ all the columns of rank $\beta m$ are full of 0's.

We can repeat the process and find $M_2$ whose rows are m-null such that in
$M + M_1 + M'_1 + M_2$ the column 1 is full of 0's, and we can keep the columns of
rank $\beta m$ full of 0's.

Then the columns of rank $\beta m + 1$, $\beta \ge 1$, are, by the same argument as
above, n-null columns. Whence we can find $M'_2$ whose columns are n-null and
such that in $M + M_1 + M'_1 + M_2 + M'_2$ the columns of rank $\beta m$ and $\beta m + 1$ are
full of 0's.

Eventually we can write M as a sum of a matrix with m-null rows and a 
matrix with n-null columns. 

Example 3. Let M be $4 \times 3$-null as in Figure 16. One can see that the columns
4 and 8 of $M + M_1$ are 3-null. In Figure 17, the column 5 of the last matrix is
3-null. In the last matrix of Figure 18, the two remaining columns are 3-null.
Fig. 16.



Eventually we have

$M = (-M_1 - M_2 - M_3) + (-M'_1 - M'_2 - M'_3)$

where the first matrix has only 4-null rows and the second has only 3-null
columns.
Fig. 19.



Remark 2. We can describe a basis of the vector space of $p \times q$-null matrices.

We take all the matrices that have only one row which is not full of 0's and
this row is of the form

(10 ... 10 ...)(10 ... 10 ...)... 

with 1's in positions $\beta m$ and $\beta m + h$ for some h between 1 and m - 1.

There are (m - 1)q such matrices, m - 1 for each of the q rows. We take all
the matrices which have only one column which is not full of 0's and this column
is of the form

(...)^(0 1 ... 0 1)^(... 0 1 ... 0 1)^ 

with 1's in positions $\alpha n$ and $\alpha n + k$ for some k between 1 and n - 1.

There are (n - 1)p such matrices, n - 1 for each of the p columns. The set of
these matrices certainly generates the whole vector space.

But we can express the last p - 1 columns as linear combinations of the other
columns and of the rows, see Figure 19, and we have a decomposition as in
Figure 20.

The dimension of the vector space is then

$(n - 1)p + (m - 1)q - (p - 1)(q - 1)$

and this is compatible with the fact that if we know the elements in the (p - 1)
first columns and the (q - 1) first rows of a $p \times q$-null matrix, then we know the
matrix.
Abstract. Recently, several efforts have been made to remove
the polarization of membranes from P systems with active membranes;
the present paper is a contribution in this respect. In order to compensate
for the loss of power caused by avoiding polarizations, we introduce
tables of rules: each membrane has several associated sets of rules, one of
which is non-deterministically chosen in each computation step. Three
universality results for tabled P systems are given, trying to use rules of
as few types as possible. Then, we consider tables with obligatory rules,
that is, rules which must be applied at least once when the table is applied.
Systems which use tables with at most one obligatory rule are proven to
be able to solve the SAT problem in linear time. Several open problems are
also formulated.



1 Introduction 

In membrane computing, P systems with active membranes have a special
place, because they provide biologically inspired means to solve
computationally hard problems: by using the possibility to divide membranes,
one can create an exponential working space in linear time, which can then
be used in a parallel computation for solving, e.g., NP-complete problems in
polynomial or even linear time. Details can be found in [7], [8], as well as in the
comprehensive page at the web address http://psystems.disco.unimib.it.

One of the important ingredients of P systems with active membranes is the
polarization of membranes: besides a label, each membrane also has an
"electrical charge", one of + (positive), - (negative), 0 (neutral). These electrical
charges correspond only remotely to biological facts; by sending ions outside,
cells and cell compartments can get polarizations, but this is not a very
common phenomenon. Starting from this observation, and also as a mathematical
challenge, several efforts have recently been made to avoid using polarizations.

However, the question seems not to be a simple one, and the best result
obtained so far was to reduce the number of "electrical charges" to two; this is
achieved in [1], where both the universality and the possibility of solving SAT
in linear time are proven for P systems with active membranes and only two
polarizations. When completely removing the polarizations, similar results are
obtained (see [2], [3]) only by compensating for the loss of power (of "programming"
possibilities) by using additional ingredients, such as the possibility of changing
the labels of membranes, the division of non-elementary membranes, etc.

The present paper goes in the same direction of research: we get rid of
polarizations and we "pay" for this by structuring the sets of rules associated with each
membrane by considering tables of rules, as in Lindenmayer systems.
Specifically, several sets of rules are associated with each membrane, and in each step of
a computation we non-deterministically choose one of these sets, and its rules are
used in the maximally parallel manner. The use of tables can have a biological
motivation, in the same way as the tables from L systems theory have a
biological origin: the change of environmental conditions (for instance, of seasons) can
select specific evolution rules for different times (different seasons).

The use of tables proves to be helpful in what concerns the computing power:
we get universality for systems of rather reduced forms, with only a few types
of rules used, and without polarizations.

An important problem remains unsolved: can tables compensate polarizations 
also in what concerns the possibility to solve hard problems in polynomial time? 
A possible negative answer to this problem would be a very nice finding: in 
view of the result from [1], it would follow that passing from one polarization 
(all membranes neutral) to two polarizations makes possible the step from the 
complexity class P to NP. 

If, however, we add to tabled P systems a further ingredient, at first sight not
very powerful, namely designating in each table obligatory rules, which
should be used at least once when applying the table, then we can solve SAT in
linear time. The construction uses at most one obligatory rule in each table.

2 P Systems with Active Membranes 

We assume the reader to be familiar with basic elements of membrane computing, 
e.g., from [7], but, for the sake of completeness, we recall here the definition of 
the class of P systems we work with, those with active membranes (and electrical 
charges). 

Such a system is a construct

$\Pi = (O, \mu, w_1, \ldots, w_m, R)$,

where:

1. $m \ge 1$ (the initial degree of the system);

2. O is the alphabet of objects;

3. $\mu$ is a membrane structure, consisting of m membranes, labeled in a
one-to-one manner with elements of $H = \{1, 2, \ldots, m\}$;

4. $w_1, \ldots, w_m$ are strings over O, describing the multisets of objects placed in
the m regions of $\mu$;
5. R is a finite set of developmental rules, of the following forms:

(a) $[\,a \to v\,]_h^e$,

for $h \in H$, $e \in \{+, -, 0\}$, $a \in O$, $v \in O^*$

(object evolution rules, associated with membranes and depending on
the label and the charge of the membranes, but not directly involving
the membranes, in the sense that the membranes are neither taking part
in the application of these rules nor are they modified by them);

(b) $a[\;]_h^{e_1} \to [\,b\,]_h^{e_2}$,

for $h \in H$, $e_1, e_2 \in \{+, -, 0\}$, $a, b \in O$

(in communication rules; an object is introduced in the membrane,
possibly modified during this process; also the polarization of the membrane
can be modified, but not its label);

(c) $[\,a\,]_h^{e_1} \to [\;]_h^{e_2}\, b$,

for $h \in H$, $e_1, e_2 \in \{+, -, 0\}$, $a, b \in O$

(out communication rules; an object is sent out of the membrane,
possibly modified during this process; also the polarization of the membrane
can be modified, but not its label);

(d) $[\,a\,]_h^e \to b$,

for $h \in H$, $e \in \{+, -, 0\}$, $a, b \in O$

(dissolving rules; in reaction with an object, a membrane can be
dissolved, while the object specified in the rule can be modified);

(e) $[\,a\,]_h^{e_1} \to [\,b\,]_h^{e_2} [\,c\,]_h^{e_3}$,

for $h \in H$, $e_1, e_2, e_3 \in \{+, -, 0\}$, $a, b, c \in O$

(division rules for elementary membranes; in reaction with an object, the
membrane is divided into two membranes with the same label, possibly
of different polarizations; the object specified in the rule is replaced in
the two new membranes by possibly new objects).

We have omitted the rules for dividing non-elementary membranes, usually
identified as being "of type (f)".

It is worth noting that the rules of all types are non-cooperative (and that
there are no further ingredients involved, such as a priority relation, for
controlling the application of rules). In the customary definition of P systems with
active membranes, the initial membranes of $\mu$ are not necessarily labeled in a
one-to-one manner, but there is no loss of generality in the assumption that the
labels are unique: we can relabel the membranes with the same label and then
duplicate the necessary rules. Moreover, because in what follows we only
consider that by membrane division we obtain membranes with the same label, the
labels present in the system are always from the set {1, 2, ..., m} present at the
beginning (maybe some of them used several times, because of the division of
membranes). Therefore, the set H of labels is specified by $\mu$, so it can be omitted
when specifying the system.

The rules of type (a) are applied in the parallel way (all objects which can
evolve by such a rule should do it), while the rules of types (b), (c), (d), (e) are
used sequentially, in the sense that one membrane can be used by at most one
rule of these types at a time. In total, the rules are used in the non-deterministic
maximally parallel manner: all objects and all membranes which can evolve,
should evolve. Only halting computations give a result, and the result is the
number of objects expelled into the environment during the computation; the
set of numbers computed in this way by the various halting computations in $\Pi$
is denoted by $N(\Pi)$.

By $NOP_{m,n,p}(pol_3, a, b, c, d, e)$ we denote the family of sets $N(\Pi)$ computed
as sketched above by systems starting with at most m membranes, using
membranes of at most n types, at most p membranes being simultaneously present,
and using all types of rules; when rules of a certain type are not used the
corresponding letter a, b, c, d, e will be missing. Also, when membrane division rules
are not used, we will specify only the number of membranes in the initial
configuration (hence, only m) as a subscript of NOP. The parameter $pol_3$ indicates
the fact that one uses three polarizations.

Further details can be found in [7], including the proof of the following result.
(We denote by REG, CF, CS, RE the families of regular, context-free,
context-sensitive, and recursively enumerable languages. In general, for a family FL of
languages, NFL denotes the family of length sets of languages in FL. Therefore,
NRE is the family of Turing computable sets of natural numbers.)

Theorem 1. $NOP_3(pol_3, a, b, c) = NRE$.

The number of polarizations was decreased to two in [1]; with the previous
notations, the result can be written as:

Theorem 2. $NOP_2(pol_2, a, c) = NRE$.

Note that the result from Theorem 1 was improved at the same time in the
number of polarizations, the number of membranes, and the number of types of
rules used.

In [3] and [2] rules of types (a) to (e) without polarizations were considered.
Because "no polarization" means "neutral polarization", we add the subscript 0
to the previous letters identifying the five types $(a_0)$ to $(e_0)$ of rules.

The power of polarizationless P systems with active membranes is not pre- 
cisely known, but it was shown in [2] that they are able to compute at least the 
Parikh images of languages generated by matrix grammars without appearance 
checking. 

Because the notion of a matrix grammar will also be used below, we introduce
it here in its general form.

A matrix grammar (with appearance checking) is a construct $G = (N, T, S, M,
F)$, where N and T are disjoint alphabets, $S \in N$, M is a finite set of sequences
of the form $(A_1 \to x_1, \ldots, A_n \to x_n)$, $n \ge 1$, of context-free rules over $N \cup T$
(with $A_i \in N$, $x_i \in (N \cup T)^*$, in all cases), and F is a set of occurrences of rules
in M (N is the nonterminal alphabet, T is the terminal alphabet, S is the axiom,
while the elements of M are called matrices).

For $w, z \in (N \cup T)^*$ we write $w \Longrightarrow z$ if there is a matrix $(A_1 \to x_1,
\ldots, A_n \to x_n)$ in M and strings $w_i \in (N \cup T)^*$, $1 \le i \le n + 1$, such that
$w = w_1$, $z = w_{n+1}$, and, for all $1 \le i \le n$, either $w_i = w'_i A_i w''_i$, $w_{i+1} = w'_i x_i w''_i$,
for some $w'_i, w''_i \in (N \cup T)^*$, or $w_i = w_{i+1}$, $A_i$ does not appear in $w_i$, and the
rule $A_i \to x_i$ appears in F. (The rules of a matrix are applied in order, possibly
skipping the rules in F if they cannot be applied; therefore we say that these
rules are applied in the appearance checking mode.)

The language generated by G is defined by $L(G) = \{w \in T^* \mid S \Longrightarrow^* w\}$.
The family of languages of this form is denoted by $MAT_{ac}$. If the set F is empty,
then the grammar is said to be without appearance checking.
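
The derivation relation defined above can be made concrete with a small sketch. It assumes that every symbol is a single character, that a matrix is a sequence of `(A, x)` pairs, and that F is given, per matrix, as the set of positions of its rules that may be skipped; the function name `apply_matrix` and this encoding are hypothetical choices made for illustration.

```python
def apply_matrix(word, matrix, ac_rules=frozenset()):
    """Return the set of all words obtained from `word` by one use of `matrix`.

    matrix   : sequence of (A, x) pairs, A a one-character nonterminal
    ac_rules : indices into `matrix` of the rules that belong to F
    """
    current = {word}
    for idx, (lhs, rhs) in enumerate(matrix):
        following = set()
        for w in current:
            positions = [k for k, symbol in enumerate(w) if symbol == lhs]
            if positions:
                # The rule must be applied; any occurrence of A may be chosen.
                for k in positions:
                    following.add(w[:k] + rhs + w[k + 1:])
            elif idx in ac_rules:
                # A is absent and the rule is in F: skip it.
                following.add(w)
            # Otherwise the matrix cannot be applied along this branch.
        current = following
    return current


print(apply_matrix("XA", [("X", "Y"), ("A", "aA")]))
print(apply_matrix("XA", [("X", "Y"), ("B", "#")], ac_rules={1}))
```

An empty result means that the matrix is not applicable to the given word.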

It is known that $CF \subset MAT \subset MAT_{ac} = RE$ and $NREG = NCF =
NMAT \subset NCS$ (for instance, the one-letter languages in MAT are known to
be regular, [6]).

A matrix grammar $G = (N, T, S, M, F)$ is said to be in the binary normal
form if $N = N_1 \cup N_2 \cup \{S, \#\}$, with these three sets mutually disjoint, and the
matrices in M are in one of the following forms:

1. $(S \to XA)$, with $X \in N_1$, $A \in N_2$,

2. $(X \to Y, A \to x)$, with $X, Y \in N_1$, $A \in N_2$, $x \in (N_2 \cup T)^*$, $|x| \le 2$,

3. $(X \to Y, A \to \#)$, with $X, Y \in N_1$, $A \in N_2$,

4. $(X \to \lambda, A \to x)$, with $X \in N_1$, $A \in N_2$, and $x \in T^*$, $|x| \le 2$.

Moreover, there is only one matrix of type 1 (that is why one usually writes it
in the form $(S \to X_0 A_0)$, in order to fix the symbols X, A present in it), and
F consists exactly of all rules $A \to \#$ appearing in matrices of type 3; # is a
trap-symbol, because once introduced, it is never removed. A matrix of type 4
is used only once, in the last step of a derivation.

For each matrix grammar there is an equivalent matrix grammar in the binary
normal form. Details can be found in [4] and in [11].



3 Tables of Rules 

In the “standard” P systems with active membranes there is specified only one 
set of rules; because the membranes are present in the rules, we precisely know 
where each rule is to be applied. A possible generalization is to consider several 
sets of rules - for uniformity with L systems, we call them tables - such that in 
each step of a computation a table is used, non-deterministically chosen (the rules 
of the selected table are applied in the maximally parallel manner, as mentioned 
in the previous section). 

This case corresponds to having global tables; a more relaxed variant is to 
consider local tables, sets of rules associated with each membrane. 

Specifically, for each membrane i we can consider sets $R_{i,1}, \ldots, R_{i,k_i}$ of rules,
for some $k_i \ge 1$, all of the rules from sets $R_{i,j}$, $1 \le j \le k_i$, involving membrane
i. In a step of a computation, we apply the rules from one of the tables
associated with each membrane, as usual, in the maximally parallel non-deterministic
manner with respect to the chosen table.

If we are allowed to "evolve" a region by means of a table for which no rule
is actually applied, then the local tables can be combined in global tables, hence
in this case the local version is weaker than the global one. However, there is no
difference from the computational point of view (at least in the cases investigated
in the next section): systems with local tables (and restricted types of rules) are
equivalent to Turing machines; moreover, the proofs are based on systems
with one or two membranes, with the "main work" of two-membrane systems
done in the inner membrane, hence choosing tables which change nothing in one
of the regions does not change the generated set of numbers.

In what follows we will consider only local tables, which is why we choose a
more restricted (and also more natural) definition of a transition step: if there are
tables by which a region can effectively evolve (at least one rule of these tables
can be effectively applied), then one of these tables must be chosen. Otherwise
stated, we cannot choose a table with no applicable rule if there are tables with
applicable rules. This restriction corresponds to the notions of parallelism
and synchronization, basic in membrane computing, and it is also useful in the
proofs below.
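
The restriction on the choice of tables can be sketched as a small selection function. This is a simplified model, assuming that a table is a list of `(object, result)` evolution rules and that a rule is applicable whenever its left-hand object occurs in the region; communication, dissolution and division rules are not modelled, and the names `applicable` and `choosable_tables` are hypothetical.

```python
def applicable(table, contents):
    """True if some rule of `table` has its left-hand object in `contents`.

    contents maps each object to its multiplicity in the region.
    """
    return any(contents.get(lhs, 0) > 0 for lhs, _ in table)


def choosable_tables(tables, contents):
    """Tables that may be chosen in the current step for this region."""
    live = [t for t in tables if applicable(t, contents)]
    # A table without applicable rules may be chosen only when
    # no table at all has an applicable rule.
    return live if live else list(tables)


tables = [[("a", "aa")], [("b", "c")]]
print(choosable_tables(tables, {"a": 1}))   # only the first table
print(choosable_tables(tables, {"z": 3}))   # no rule applies: both tables
```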

In systems with tables (either local or global) we have two levels of
non-determinism: in each step we first non-deterministically choose one table (in
the local case, associated with each membrane), and then we use the rules of
the chosen table in a non-deterministic manner (observing the restriction of
maximal parallelism for the chosen table). The standard definition of P systems
corresponds to the case where we have only one table (at the level of the whole
system).

The fact that we use (local) tables is indicated by adding tab to the notations
from the previous section.

We do not know whether the number of tables associated with membranes
matters (that is, whether it induces an infinite hierarchy of the computed sets
of numbers) or whether normal form theorems like that known for ET0L systems (two
tables are enough, see [9]) are true also in our case. In view of this open problem
it could be better to indicate also the maximal number of tables used, writing
$tab_s$ for using at most s tables, but we do not deal with this aspect here.

The usefulness of tables is intuitively obvious, because by clustering
the rules in "teams of rules" we can control in a more precise way the work of
the system. This is illustrated also by the following simple example: consider
the system

$\Pi = (\{a, b\}, [\;]_1, a, R_{1,1}, R_{1,2}, R_{1,3})$,

$R_{1,1} = \{[\,a \to aa\,]_1\}$,

$R_{1,2} = \{[\,a \to b\,]_1\}$,

$R_{1,3} = \{[\,b\,]_1 \to [\;]_1\, a\}$.

After using n > 0 times the first table (thus producing $2^n$ copies of a), we can
end the computation by using once the second table, and then $2^n$ times the third
one. Consequently, $N(\Pi) = \{2^n \mid n \ge 1\} \in NOP_1(tab, a_0, c_0)$, a set of numbers
which is not in NMAT.
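
The computation just described can be traced step by step. The sketch below rests on an assumed reading of the three tables: the first doubles every copy of a, the second rewrites every a into b, and the third sends one b out per step (out communication rules use the membrane sequentially). The function name `run` is hypothetical.

```python
from collections import Counter


def run(n):
    """Use table 1 n times, table 2 once, then table 3 until no b is left.

    Returns (number of objects expelled, number of computation steps).
    """
    inside = Counter({"a": 1})      # initial multiset of the single membrane
    expelled = 0
    steps = 0
    for _ in range(n):              # table 1: every a becomes aa (parallel)
        inside["a"] *= 2
        steps += 1
    inside["b"] = inside.pop("a")   # table 2: every a becomes b (parallel)
    steps += 1
    while inside["b"] > 0:          # table 3: one b leaves per step
        inside["b"] -= 1
        expelled += 1
        steps += 1
    return expelled, steps


for n in (1, 2, 3):
    print(n, run(n))
```

Under these assumptions the number of expelled objects doubles with each additional use of the first table, while the number of steps grows with the number of objects to be sent out.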
4 Universality Results

The usefulness of tables is illustrated also by the results below: the computational
universality is obtained without polarizations for various reduced combinations
of types of rules.

The first result uses rules of the first three types (hence neither membrane
dissolution nor membrane division operations).

Theorem 3. $NOP_2(tab, a_0, b_0, c_0) = NRE$.

Proof. We only (have to) prove the inclusion $\supseteq$, and to this aim we use the
equality $NRE = NMAT_{ac}$. Let us consider a matrix grammar with appearance
checking $G = (N, \{a\}, S, M, F)$ in the binary normal form, hence with $N =
N_1 \cup N_2 \cup \{S, \#\}$ and with matrices of the four types mentioned in Section
2. All matrices of M are supposed to be labeled in an injective manner with
$m_i$, $1 \le i \le n$ (hence i uniquely identifies the matrix). Each terminal matrix
$(X \to \lambda, A \to x)$ is replaced with $(X \to f, A \to x)$, where f is a new symbol
(the label of the matrix remains unchanged).

We construct the tabled P system with active membranes, $\Pi$, with the
components:

$O = N_1 \cup N_2 \cup \{Z_i, Z'_i, \langle i \rangle \mid 1 \le i \le n\} \cup \{a, a', e, f, \#\}$,

$\mu = [\,[\;]_2\,]_1$,
$w_1 = \lambda$,

$w_2 = X_0 A_0 e$, where $(S \to X_0 A_0)$ is the initial matrix of G,
and the following tables (by U we denote the set $N_1 \cup \{Z_i, Z'_i, \langle i \rangle \mid 1 \le i \le n\}$).


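1. For each matrix $m_i : (X \to Y, A \to x)$ in M of types 2 or 4, we consider the
tables

$R_{2,i} = \{[\,X \to Z_i\,]_2,\ [\,A\,]_2 \to [\;]_2 \langle i \rangle,\ [\,e\,]_2 \to [\;]_2 \#\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in \ldots\}$,

$R'_{2,i} = \{[\,Z_i \to Z'_i\,]_2,\ \langle i \rangle [\;]_2 \to [\,\langle i \rangle\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in \ldots\}$,

$R''_{2,i} = \{[\,Z'_i \to \lambda\,]_2,\ [\,\langle i \rangle \to xY\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U\}$.

2. For each matrix $m_i : (X \to Y, A \to \#)$ in M of type 3, we consider the table

$R_{2,i} = \{[\,X \to Y\,]_2,\ [\,A \to \#\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U\}$.

3. We also consider the following tables:

$R_{2,f} = \{[\,f \to \lambda\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U \cup N_2\}$,

$R_{2,a} = \{[\,a\,]_2 \to [\;]_2 a',\ [\,\# \to \#\,]_2\}$,

$R_1 = \{[\,a'\,]_1 \to [\;]_1 a,\ [\,\# \to \#\,]_1\}$.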
We have the equality $N(\Pi) = \{n \mid a^n \in L(G)\}$. Indeed, we start with the
multiset $X_0 A_0 e$ in the central membrane; assume that we have here a multiset
$Xwe$ for some $X \in N_1$ and $w \in (N_2 \cup \{a\})^*$. There is only one table for membrane
1, sending out a copy of a (provided that there are copies of a' in the skin
region), and using the trap-rule $\# \to \#$ provided that the object # is present;
in this latter case, the computation will never stop. If applied in membrane 2
when $Xwe$ is here, the table $R_{2,f}$ will introduce the trap-object #, and this
happens also if we use any table of the forms $R'_{2,i}$, $R''_{2,i}$. Thus, we can apply
only a table of type $R_{2,i}$ for $m_i$ a matrix of M. That matrix should be either
of the form $m_i : (X \to Y, A \to x)$ (of type 2 or of type 4), or of the form
$m_i : (X \to Y, A \to \#)$ (of type 3): if the first rule of the matrix is $\alpha \to \beta$ with
$\alpha \ne X$, then the trap-object is introduced.

The case of a matrix of type 3 is simpler: if A is present, then the trap-object
is introduced, and the computation will never stop (because of the table $R_{2,a}$,
which can be used forever). If A is not present, then we just change X into Y.
Thus, the simulation of the matrix $m_i$ of type 3 is correct.

If we choose to simulate a matrix of types 2 or 4, then it must have the
second rule of the form $A \to x$, for A as specified by the table $R_{2,i}$: if the rule
$[\,A\,]_2 \to [\;]_2 \langle i \rangle$ is not used, thus "keeping busy" the membrane, then the rule
$[\,e\,]_2 \to [\;]_2 \#$ must be used, and the computation will never stop (table $R_1$ can be
applied forever).

In the next step we have to continue the simulation of the matrix $m_i$ by
using the corresponding table $R'_{2,i}$. This is the only table which can be applied
without introducing the trap-object #. In this way, $\langle i \rangle$ comes back to membrane
2, and $Z_i$ is replaced by $Z'_i$. In the next step, again only one table can be used
without introducing the trap-object, namely $R''_{2,i}$. It erases the object $Z'_i$ and
replaces $\langle i \rangle$ with $xY$, thus completing the simulation of the matrix.

At any moment, if any object a is present in membrane 2, then table $R_{2,a}$
can be used and a is sent out (first transformed into a' in the skin region).

The system is returned to a configuration with the contents of membrane
2 as in the beginning, hence the process can be iterated. When the object f
is introduced, no table $R_{2,i}$, $R'_{2,i}$, $R''_{2,i}$ can be used. By means of $R_{2,f}$ we check
whether any symbol from $N_2$ is present, hence whether the derivation in G is
terminal. The computation in $\Pi$ ends by sending out all copies of a, hence $N(\Pi)$
equals the length set of the language $L(G)$. □

In the previous proof, the role of rules of types $(b_0)$, $(c_0)$ (besides sending the
result outside the system) was to ensure that only one object A is replaced by x,
thus correctly simulating the second rule of a matrix $(X \to Y, A \to x)$ of types
2 or 4. This can be done also by using rules of type $(e_0)$.

Theorem 4. $NOP_{2,2,3}(tab, a_0, c_0, e_0) = NRE$.

Proof. As above, we consider a matrix grammar with appearance checking $G =
(N, \{a\}, S, M, F)$ in the binary normal form, with the matrices of M labeled in an
injective manner with $m_i$, $1 \le i \le n$, and each terminal matrix $(X \to \lambda, A \to x)$
replaced with $(X \to f, A \to x)$, where f is a new symbol.

We now construct the tabled P system with active membranes $\Pi$, with the
components:

$O = N_1 \cup N_2 \cup \{Z_i, \langle i \rangle \mid 1 \le i \le n\} \cup \{a, a', d, e, f, \#\}$,

$\mu = [\,[\;]_2\,]_1$,
$w_1 = \lambda$,

$w_2 = X_0 A_0 e$, where $(S \to X_0 A_0)$ is the initial matrix of G,
and the following tables (by U we denote the set $N_1 \cup \{\langle i \rangle \mid 1 \le i \le n\}$).

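1. For each matrix $m_i : (X \to Y, A \to x)$ in M of types 2 or 4, we consider the
tables

$R_{2,i} = \{[\,X \to Z_i\,]_2,\ [\,A\,]_2 \to [\,\langle i \rangle\,]_2 [\,d\,]_2,\ [\,e\,]_2 \to [\,\#\,]_2 [\,\#\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U \cup \{a\}\}$,

$R'_{2,i} = \{[\,Z_i \to \lambda\,]_2,\ [\,\langle i \rangle \to xY\,]_2,\ [\,d \to \#\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U \cup \{a\}\}$.

2. For each matrix $m_i : (X \to Y, A \to \#)$ in M of type 3, we consider the table

$R_{2,i} = \{[\,X \to Y\,]_2,\ [\,A \to \#\,]_2,\ [\,d \to \#\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U \cup \{a\}\}$.

3. We also consider the following tables:

$R_{2,f} = \{[\,f \to \lambda\,]_2,\ [\,d \to \#\,]_2\} \cup \{[\,\alpha \to \#\,]_2 \mid \alpha \in U \cup N_2\}$,

$R_{2,d} = \{[\,d\,]_2 \to d,\ [\,a\,]_2 \to [\;]_2 a',\ [\,\# \to \#\,]_2\}$,

$R_1 = \{[\,a'\,]_1 \to [\;]_1 a,\ [\,\# \to \#\,]_1\}$.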

The equality $N(\Pi) = \{n \mid a^n \in L(G)\}$ follows in a similar way as in the
previous proof, this time with the interplay of rules $[A]_2 \to [\langle i\rangle]_2 [d]_2$,
$[e]_2 \to [\#]_2 [\#]_2$ ensuring that the second rule of each matrix of type 2 or
4 is correctly simulated (used exactly once): if the second rule is used, then the
computation never stops, hence $[A]_2 \to [\langle i\rangle]_2 [d]_2$ must be used. In this way,
membrane 2 is divided. In the first copy of the membrane we have the object
$\langle i\rangle$, which will complete the simulation of the matrix. In the second copy of
the membrane, the one containing the object $d$, we cannot use any table which
contains the rule $d \to \#$, hence the only continuation is by using the table $R_{2,d}$.
This dissolves the membrane, and its objects, left free in the skin region, will no
longer evolve. The matrices of type 3 are again simulated in only one step of a
computation in $\Pi$. All copies of object $a$ are immediately sent out of membrane
2 (to prevent their duplication when dividing the membrane), and from the skin
region are sent out of the system. We leave the details to the reader and conclude
that the system correctly simulates the matrix grammar $G$. □
One of the difficulties in the previous proofs was to inhibit the parallelism of
using the rules of type $(a_0)$. In membrane computing, the usual way to do this is
by using catalysts, distinguished objects which never evolve, but can enter rules
of the form $ca \to cv$, where $a$ is a single object, which evolves under the control
of the catalyst $c$. This idea can be considered also for P systems with active
membranes, allowing rules of type $(a_0)$ of the form $[ca \to cv]_h$, where $c$ is a
catalyst, $a$ is an object and $v$ a multiset of objects. (When specifying a system
with catalysts, the set $C$ of catalysts is explicitly given after the set of objects.)
We indicate the use of catalysts by writing $cat_r$ in the notation for families of
numbers computed by systems of a given type as above; $r$ indicates the fact that
at most $r$ catalysts are used.

The previous results have the following counterpart for the catalytic case - 
with only two types of rules being used, and with only one membrane (note that 
one catalyst suffices). 

Theorem 5. $NOP_1(tab, cat_1, a_0, c_0) = NRE$.

Proof. We consider again a matrix grammar with appearance checking $G =
(N, \{a\}, S, M, F)$ in the binary normal form, with each terminal matrix
$(X \to \lambda, A \to x)$ replaced with $(X \to f, A \to x)$, where $f$ is a new symbol, and we
construct the tabled P system with catalysts $\Pi$, with the components:

$O = N_1 \cup N_2 \cup \{a, c, d, f, \#\}$,

$C = \{c\}$,

$\mu = [\;]_1$,

$w_1 = X_0 A_0 d$, where $(S \to X_0 A_0)$ is the initial matrix of $G$,

and the following tables.

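1. For each matrix $m_i : (X \to Y, A \to x)$ in $M$ of types 2 or 4, we consider the
table

$R_{1,i} = \{[X \to Y]_1,\ [cA \to cx]_1,\ [cd \to c\#]_1\}$
$\quad \cup\ \{[Z \to \#]_1 \mid Z \in N_1 \cup \{f\}\}$.

2. For each matrix $m_i : (X \to Y, A \to \#)$ in $M$ of type 3, we consider the table

$\quad \cup\ \{[Z \to \#]_1 \mid Z \in N_1 \cup \{f\}\}$.

3. We also consider the following tables:

$\quad \cup\ \{[\alpha \to \#]_1 \mid \alpha \in N_1 \cup N_2\}$,

$R_{1,a} = \{[a]_1 \to [\;]_1 a\}$.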
This time, the matrices $m_i$ of types 2 and 4 are simulated by a single table,
of type $R_{1,i}$: the first rule must be used (that is, the symbol $X$ must be present),
otherwise the trap-object is introduced; similarly, the rule $cA \to cx$ must be
used, otherwise the catalyst will evolve together with the available object $d$ and
again the trap-object is introduced. Each matrix $m_i$ of type 3 is simulated by the
corresponding table $R_{1,i}$. After introducing the object $f$, no table as above can
be used (without introducing the trap-object), hence we have to use $R_{1,f}$, which
checks whether the derivation in $G$ is terminal. At any time, the copies of object $a$
are sent out by means of the table $R_{1,a}$. Consequently, $N(\Pi) = \{n \mid a^n \in L(G)\}$,
and this completes the proof. □

The previous result is relevant in view of the fact that transition P systems 
with only one catalyst (not with active membranes) are not known to be uni- 
versal, while the universality was proved for the case of using two catalysts [5]. 

5 Tables with Obligatory Rules 

The idea of distinguishing some rules of each table and imposing that these
rules are applied at least once when the tables are selected has at least two
motivations. First, this ensures that a selected table does
not leave unchanged the objects from the region where it is applied. Second, it
recalls the matrices from matrix grammars, whose rules are all applied when
applying a matrix. However, having several obligatory rules in the same table
is a way to make the system cooperative: if both $a \to u$ and $b \to v$ must be
simultaneously used at least once, then $ab \to uv$ must be used at least once
(but the two cases are not equivalent, because besides evolving one $a$ and one
$b$, by rules $a \to u$ or $b \to v$ we can separately evolve further copies of $a$ or of $b$,
respectively).

That is why in what follows we allow at most one obligatory rule in each 
table. Such a rule is marked with a dot; when the table is used, its obligatory 
rule must be used at least once, otherwise the table is not allowed to be chosen. 

This apparently small change in the definition of tabled P systems is powerful 
enough in order to lead to fast solutions (making use of membrane division) to 
computationally hard problems. 

Theorem 6. Tabled P systems with active membranes using obligatory rules (at 
most one in each table) can solve SAT in linear time; the construction is uniform, 
and the system is deterministic. 

Proof. Let us consider a propositional formula $\gamma = C_1 \wedge \cdots \wedge C_m$, consisting of
$m$ clauses $C_j = y_{j,1} \vee \cdots \vee y_{j,k_j}$, $1 \le j \le m$, where $y_{j,i} \in \{x_l, \neg x_l \mid 1 \le l \le n\}$,
$1 \le i \le k_j$ ($n$ variables are used). Without loss of generality, we may
assume that no clause contains two occurrences of some $x_i$ or two occurrences
of some $\neg x_i$ (the formula is not redundant at the level of clauses), or both $x_i$
and $\neg x_i$ (otherwise such a clause is trivially satisfied, hence can be removed).
Therefore, in each clause there are at most $n$ literals.
We encode $\gamma$, which is an instance of SAT with size parameters $n$ and $m$, by
the multiset

$cod(\gamma) = \{s_{i,j} \mid y_{j,r} = x_i$, for some $1 \le i \le n$, $1 \le j \le m$, $1 \le r \le k_j\}$

$\quad \cup\ \{s'_{i,j} \mid y_{j,r} = \neg x_i$, for some $1 \le i \le n$, $1 \le j \le m$, $1 \le r \le k_j\}$.

(We replace each variable $x_i$ from each clause $C_j$ with $s_{i,j}$ and each negated
variable $\neg x_i$ from each clause $C_j$ with $s'_{i,j}$; then we remove all parentheses and
connectives. In this way we pass from $\gamma$ to $cod(\gamma)$ in a number of steps which is
linear with respect to $n \cdot m$.)
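
The encoding step can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the input convention (each clause a list of signed integers, `+i` for the variable with index i and `-i` for its negation) and the function name `encode_formula` are hypothetical and do not come from the text; the multiset is represented with `collections.Counter`.

```python
from collections import Counter


def encode_formula(clauses):
    """Encode a CNF formula as a multiset of symbols.

    Assumed input format (not from the text): clause j is a list of
    non-zero integers, +i standing for variable i and -i for its negation.
    A positive occurrence of variable i in clause j yields ("s", i, j);
    a negated occurrence yields ("s'", i, j).
    """
    code = Counter()
    for j, clause in enumerate(clauses, start=1):
        for literal in clause:
            i = abs(literal)
            symbol = ("s", i, j) if literal > 0 else ("s'", i, j)
            code[symbol] += 1
    return code


# (x1 or not x2) and (x2 or x3)
example_code = encode_formula([[1, -2], [2, 3]])
```

The loop visits every literal once, which matches the remark that the encoding is obtained in a number of steps linear in the size of the formula.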

We construct the P system $\Pi$ with the following components:

There is no object in the skin membrane, while region 2 contains only the
counter $c_0$, which will continuously increase its subscript, by means of a table
associated with this region.
The "main work" is done in membrane 3. In the beginning, we have here the
object $a_1$, hence the only applicable table is $R_{3,d}$, which divides the membrane,
at the same time expanding the object $a_1$ to the truth values $t_1$ = true and
$f_1$ = false of variable $x_1$. In the next step, the only tables which can be applied
in the two membranes with label 3 are $R_{3,1,t}$ and $R_{3,1,f}$: the obligatory rules
select the tables in a precise way. At the same time with the passage from $t_1$, $f_1$
to copies of $a_2$, we also introduce all clauses which are satisfied by $t_1$ and $f_1$,
respectively (encoded by the objects $r_j$). The process continues now with $a_2$,
then with $a_3$, and so on, until expanding all variables and introducing all clauses
satisfied by these truth-assignments.

Therefore, after $2n$ steps we get $2^n$ membranes 3, containing the clauses
satisfied by the $2^n$ possible truth-assignments for the $n$ variables.

In step $2n+1$ the only table which can be applied for membranes 3 is $R_{3,0}$:
$a_{n+1}$ is replaced with $d_1$ (which will check whether there is any membrane where
all clauses are satisfied), and all $r_j$ are primed.

From now on, for at most $m$ steps, we use the tables $R_{3,j}$, $1 \le j \le m$.
(Because these tables use primed versions of objects $r_j$, they were not applicable
before using table $R_{3,0}$ - and this was the reason for priming.) Each of these
tables removes the occurrences of one $r_j'$; because this operation is done by an
obligatory rule, this is a way to check that the respective $r_j'$ is present. At the
same time, the subscript of the object $d$ from each membrane 3 increases by
one. If in a given membrane 3 there are copies of $r_j'$ for all $j = 1, 2, \ldots, m$, then
the respective object $d$ reaches the subscript $m+1$, which indicates the fact
that the corresponding truth-assignment has satisfied all clauses of $\gamma$. If a given
membrane 3 does not contain copies of all $r_j'$, $1 \le j \le m$, then that membrane
cannot evolve $m$ steps, hence the local object $d$ remains of the form $d_j$ with
$j \le m$.

Simultaneously, the object from region 2 arrives at the form $c_{2n+m+1}$.

If at least one membrane 3 contains the object $d_{m+1}$ (hence the formula is
satisfiable), then in step $2n+m+2$ we use the table $R_{3,m+1}$ and the object
$d_{m+1}$ is sent to membrane 2 (at the same time, in region 2 we get $c_{2n+m+2}$). If
no membrane 3 sends out the object $d_{m+1}$, hence the formula is not satisfiable,
then the objects $d_j$ with $j \le m$ remain inside these membranes - but $c_{2n+m+1}$
evolves to $c_{2n+m+2}$ in region 2.

Now, in step $2n+m+3$, if any object $d_{m+1}$ is present in region 2, then one
of them will dissolve membrane 2, and will produce the object yes, which is left
free in the skin region; in the next step, this object will leave the system, thus
signaling that the formula is satisfiable. Because membrane 2 is dissolved, the
object $c_{2n+m+3}$ (obtained in step $2n+m+3$) also remains free in the skin region,
where it cannot evolve any more. If no object $d_{m+1}$ is present in region 2, then
this membrane is not dissolved, $c$ will get the subscript $2n+m+3$ and then in
step $2n+m+4$ will exit membrane 2 transformed into no; in the next step, this
object exits the system, signaling that the formula is not satisfiable.

Thus, either we get yes outside the system in step $2n+m+4$, or no in step
$2n+m+5$, and these objects correctly indicate whether or not $\gamma$ is satisfiable.

The system $\Pi$ can be constructed in polynomial time by a Turing machine,
starting from $n$ and $m$, and it works in a deterministic manner (after each
reachable configuration there is at most one next configuration which can be
correctly reached). □

If we are more interested in the time our system works than in the time
of constructing it or in its deterministic behavior, then the answer to a given
instance of SAT can be obtained in $n+m+4$ steps, by considering a system
constructed in a semi-uniform manner (starting directly from an instance of the
problem) in the following way.

For a given formula $\gamma$ as above, for $i = 1, 2, \ldots, n$, we denote

$sat(t_i) = \{r_j \mid$ there is $1 \le r \le k_j$ such that $y_{j,r} = x_i\}$,
$sat(f_i) = \{r_j \mid$ there is $1 \le r \le k_j$ such that $y_{j,r} = \neg x_i\}$.
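
As a hedged illustration of these two sets, the following Python sketch computes, for every variable, the indices of the clauses satisfied by assigning it true and by assigning it false. The input convention (clauses as lists of signed integers) and the name `sat_sets` are assumptions made for the example only; clause indices stand in for the clause objects.

```python
def sat_sets(clauses, n):
    """Return (sat_t, sat_f) for a CNF formula over variables 1..n.

    sat_t[i] holds the indices j of the clauses containing variable i
    positively; sat_f[i] holds those containing its negation.
    Assumed input format (not from the text): +i / -i signed integers.
    """
    sat_t = {i: set() for i in range(1, n + 1)}
    sat_f = {i: set() for i in range(1, n + 1)}
    for j, clause in enumerate(clauses, start=1):
        for literal in clause:
            target = sat_t if literal > 0 else sat_f
            target[abs(literal)].add(j)
    return sat_t, sat_f


# (x1 or not x2) and (x2 or x3)
example_t, example_f = sat_sets([[1, -2], [2, 3]], n=3)
```

Because only the tables depend on these sets, computing them is the formula-dependent part of a semi-uniform construction.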

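Then, we construct the system $\Pi$ with:

$O = \{a_i, t_i, f_i \mid 1 \le i \le n\} \cup \{r_i \mid 1 \le i \le m\}$
$\quad \cup\ \{b_i \mid 0 \le i \le n+m+2\} \cup \{c_i \mid 0 \le i \le n+m+3\}$
$\quad \cup\ \{yes, no\}$,

$\mu = [\,[\,[\;]_3\,]_2\,]_1$,

$w_1 = \lambda$, $w_2 = c_0$, $w_3 = b_0 a_1 a_2 \ldots a_n$,

$R_{1,1} = \{[yes]_1 \to [\;]_1 yes,\ [no]_1 \to [\;]_1 no\}$,

$R_{2,1} = \{[b_{n+m+2}]_2 \to yes,\ [c_{n+m+3}]_2 \to [\;]_2 no\}$
$\quad \cup\ \{[c_i \to c_{i+1}]_2 \mid 0 \le i \le n+m+2\}$,

$R_{3,i} = \{[a_i]_3 \overset{\bullet}{\to} [t_i]_3 [f_i]_3\}$
$\quad \cup\ \{[b_j \to b_{j+1}]_3 \mid 0 \le j \le n-1\}$, for each $i = 1, 2, \ldots, n$,

$R_{3,n+1} = \{[b_n \to b_{n+1}]_3\}$
$\quad \cup\ \{[t_i \to sat(t_i)]_3,\ [f_i \to sat(f_i)]_3 \mid 1 \le i \le n\}$,

$R_{3,n+1+i} = \{[r_i \overset{\bullet}{\to} \lambda]_3\}$
$\quad \cup\ \{[b_{n+j} \to b_{n+j+1}]_3 \mid 1 \le j \le m\}$, for each $i = 1, 2, \ldots, m$,

$R_{3,n+m+2} = \{[b_{n+m+1}]_3 \to [\;]_3 b_{n+m+2}\}$.
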

This time, in the first $n$ steps we divide membrane 3 again and again, by
means of the obligatory rules of tables $R_{3,i}$, $1 \le i \le n$, which expand the
objects $a_i$ to the truth values $t_i$ = true and $f_i$ = false of variable $x_i$. The order
of using tables is arbitrary, but after $n$ steps we get the same configuration
irrespective of this order: $2^n$ membranes 3, containing the $2^n$ truth-assignments
of the $n$ variables, as well as the object $b_n$ (at the same time, in membrane 2 we
have obtained $c_n$).

In step $n+1$, in region 3 we can only apply $R_{3,n+1}$, which replaces $b_n$ with
$b_{n+1}$ and each $t_i$, $f_i$ by the clauses satisfied by these truth values (specifically, $t_i$
is replaced by $sat(t_i)$ and $f_i$ by $sat(f_i)$).

From now on, for at most $m$ steps, we use the objects $b_{n+j+1}$, $1 \le j \le m$, in
the same way as objects $d_j$ were used in the previous proof, in order to check
whether or not at least one truth-assignment has satisfied all clauses. If this is
the case, then at least one membrane 3 will contain the object $b_{n+m+1}$, which
will exit to membrane 2, will dissolve it in step $n+m+3$, and will produce the
object yes, which then leaves the system. If not, $c_{n+m+3}$ will exit membrane 2
(in step $n+m+4$) transformed into no, which will exit the system in one further
step.
The system $\Pi$ can be constructed in polynomial time by a Turing machine,
starting from $\gamma$ (only the tables directly depend on the formula), and the
system is clearly confluent.
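
The logic of the two phases (generating all truth-assignments, then checking the clauses one by one) can be mirrored by a sequential Python sketch. This is only an analogue under stated assumptions: it enumerates the assignments one after another and so takes exponential time, whereas the P system handles them in parallel through membrane division; the input format (signed integers) and the name `decide_sat` are hypothetical.

```python
from itertools import product


def decide_sat(clauses, n):
    """Sequential stand-in for the generate-and-check scheme.

    Each truth-assignment plays the role of one membrane; it collects the
    indices of the clauses its literals satisfy, then a counter advances
    once per clause found, as the checking phase does with its counter.
    """
    m = len(clauses)
    sat_t = {i: set() for i in range(1, n + 1)}
    sat_f = {i: set() for i in range(1, n + 1)}
    for j, clause in enumerate(clauses, start=1):
        for literal in clause:
            (sat_t if literal > 0 else sat_f)[abs(literal)].add(j)

    for assignment in product((True, False), repeat=n):
        satisfied = set()
        for i, value in enumerate(assignment, start=1):
            satisfied |= sat_t[i] if value else sat_f[i]
        checked = 0
        for j in range(1, m + 1):
            if j not in satisfied:
                break  # this "membrane" cannot continue evolving
            checked += 1
        if checked == m:
            return True  # some assignment satisfies every clause
    return False
```

The sketch fixes an order for the clause checks for simplicity; it says nothing about the step counts of the system itself.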



6 Final Remarks 

Contributing to the "campaign" of removing polarizations from P systems with
active membranes, we have obtained several universality results for systems with-
out polarizations, but having the rules structured in tables. When tables with
(at most one) obligatory rule are used, NP-complete problems can be solved
in linear time - this is illustrated with the SAT problem.

Two important problems have remained open: (i) are systems without po- 
larizations and without tables (maybe with catalysts) universal? (ii) can NP- 
complete problems be solved in polynomial time by means of tabled P systems 
with active membranes without polarizations (and without obligatory rules)? 
Abstract. In a previous paper [Pa2002] the author developed a theory
of decomposition into prime factors of multistage interconnection
networks. Based on that theory some new properties of such networks are
investigated and proved.



1 Preliminary 

Multistage Interconnection Networks (MIN’s) play an important role in the de- 
sign of the hardware and operating systems of computers, in particular of parallel 
computers, and they enable efficient communication algorithms between proces- 
sors and memories, see e.g. [L92]. We shall be concerned in this paper mainly 
with one such MIN, the Butterfly network, some of its isomorphic networks and 
some of its extensions. In general, MIN’s can be described by an n-layered graph 
defined as below.

Definition 1. An n-layered graph is a graph $G = (X_1, X_2, \ldots, X_n, E_2, E_3, \ldots, E_n)$
where the $X_i$'s are sets of vertices, and the $X_i$ vertices are connected only to the
$X_{i+1}$ vertices by the edges $E_{i+1}$.

Clearly every layer of an n-layered graph can be depicted as a bipartite graph
with $X_i$, $X_{i+1}$ as vertices and $E_{i+1}$ as edges. Thus an n-layered graph is a con-
catenation of $n-1$ bipartite graphs and every such bipartite graph will be called
a stage. Even and Litman introduced in [EL97] a technique, the "Cross Product"
technique, and showed that this technique enables the representation of several
well-known n-layered networks as a cross product of simple such networks. This
author extended their technique in [Pa2002] into a full decomposition theory.
The basic definitions and properties and some main results from [Pa2002] are
reproduced below.

Definition 2. Let $B_1 = (X_1, Y_1, E_1)$ and $B_2 = (X_2, Y_2, E_2)$ be two bipartite
graphs. Their cross-product is the bipartite graph $G_3 = (X_3, Y_3, E_3)$ such that
$X_3 = X_1 \times X_2$, $Y_3 = Y_1 \times Y_2$ ('$\times$' represents the Cartesian product operation)
and $((x_{1i}, x_{2j}), (y_{1k}, y_{2l})) \in E_3$ if and only if $(x_{1i}, y_{1k}) \in E_1$ and $(x_{2j}, y_{2l}) \in E_2$.

We shall use the notation '$\times$' for the cross-product operation of bipartite
graphs as defined above. For a given bipartite graph $B = (X, Y, E)$ we shall refer
to $X$ and $Y$ as the floor and the ceiling, respectively, of $B$.
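
The cross-product of two bipartite graphs can be sketched in a few lines of Python. The representation (a triple of floor list, ceiling list and edge set) and the two small example graphs, a one-to-two "fork" and a pair of parallel edges, are illustrative assumptions and are not meant to fix the orientation of any graph discussed in the text.

```python
from itertools import product


def cross_product(b1, b2):
    """Cross-product of bipartite graphs given as (floor, ceiling, edges).

    Floors and ceilings are multiplied as Cartesian products, and a pair
    of vertex tuples is joined exactly when both components are joined.
    """
    x1, y1, e1 = b1
    x2, y2, e2 = b2
    floor = list(product(x1, x2))
    ceiling = list(product(y1, y2))
    edges = {((u1, u2), (v1, v2)) for (u1, v1) in e1 for (u2, v2) in e2}
    return floor, ceiling, edges


# Hypothetical example graphs: one floor vertex joined to two ceiling
# vertices, and two disjoint parallel edges.
fork = ([0], [0, 1], {(0, 0), (0, 1)})
parallel = ([0, 1], [0, 1], {(0, 0), (1, 1)})
floor, ceiling, edges = cross_product(fork, parallel)
```

Using `itertools.product` on the sorted vertex lists also yields the tuples in lexicographic order, which is convenient when vertices are later relabeled by consecutive integers.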
Definition 3. Let $B_1$ and $B_2$ be two bipartite graphs. We shall say that $B_1$ is
isomorphic to $B_2$ (notation: $B_1 \cong B_2$) if there are one-to-one and onto mappings $\psi$ and
$\phi$ from $X_1$ to $X_2$ and from $Y_1$ to $Y_2$, respectively, such that $(x_i, y_j) \in E_1$ iff
$(\psi(x_i), \phi(y_j)) \in E_2$.



It is easy to see that the cross product operation is associative and
commutative up to isomorphism of the resulting graphs.

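Definition 4. Let $G_1 = (X_1, \ldots, X_n, E_2, \ldots, E_n)$ and $G_2 = (X_1', \ldots, X_n', E_2', \ldots, E_n')$
be two n-layered graphs. $G_1 \times G_2 = G_3 = (X_1'', \ldots, X_n'', E_2'', \ldots, E_n'')$ is the n-layered
graph such that, for all $1 \le i \le n-1$, the bipartite graph $(X_i'', X_{i+1}'', E_{i+1}'')$ is the
cross-product of $(X_i, X_{i+1}, E_{i+1})$ and $(X_i', X_{i+1}', E_{i+1}')$.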

Definition 5. Two n-layered graphs $G_1 = (X_1, \ldots, X_n, E_2, \ldots, E_n)$ and $G_2 =
(X_1', \ldots, X_n', E_2', \ldots, E_n')$ are isomorphic if for all $i$, $1 \le i \le n-1$, the bipartite
graphs $B_i = (X_i, X_{i+1}, E_{i+1})$ and $B_i' = (X_i', X_{i+1}', E_{i+1}')$ are isomorphic and
the isomorphisms $\psi_i$ and $\phi_i$ can be defined in a way such that for all $2 \le i \le
n-1$, $\psi_i(X_i) = \phi_{i-1}(X_{i-1})$ (see Definition 3).

Remark. The notion of isomorphism introduced here differs from the standard
definition of graph isomorphism in that it is sensitive to the identities of the
vertices of the graph. The notion introduced here would have deserved a different
name. Nevertheless, we choose to use this name in order to comply with the name
common in the literature investigating multilayered graphs.

It follows from the definitions that the graph $G = G_1 \times G_2 \times \cdots \times G_k$ is iso-
morphic to all graphs of the form $G_{\pi(1)} \times \cdots \times G_{\pi(k)}$ where $\pi$ is any permutation
of $(1, \ldots, k)$.

2 The BCP Family of Graphs 

Consider the primitive simple bipartite graphs labeled a, b, c and 1 shown below:

         0           0   1         0   1        0
    a:  / \     b:    \ /      c:  |   |    1:  |
       0   1           0           0   1        0

Denote $\Sigma = \{a, b, c\}$, let $\sigma \in \Sigma$ and let $B_\sigma$ be the graph whose label is $\sigma$. Then
for any word $w \in \Sigma^*$, $w = \sigma_1 \cdots \sigma_k$, $w$ represents the graph $B_w = B_{\sigma_1} \times \cdots \times B_{\sigma_k}$.
Let $B_{\sigma_i} = (X_{\sigma_i}, Y_{\sigma_i}, E_{\sigma_i})$ and $B_w = (X_w, Y_w, E_w)$.

Then, by our definitions $X_w = X_{\sigma_1} \times \cdots \times X_{\sigma_k}$ and $Y_w = Y_{\sigma_1} \times \cdots \times Y_{\sigma_k}$. We
shall label the vertices of $X_w$ and $Y_w$ according to the following procedure: The
vertices in $X_{\sigma_i}$ and $Y_{\sigma_i}$ are labeled by either 0 or 1 as per the definition of $B_{\sigma_i}$.
The vertices in $X_w$ and $Y_w$ correspond to $k$-tuples of zeroes and ones. Order the
vertices in $X_w$ and $Y_w$ according to the lexicographic order of the corresponding
$k$-tuples and then label those vertices with consecutive integers, starting with
0, according to the order of their $k$-tuples. The notation $(X_w)$ and $(Y_w)$ will be
used to denote the sets $X_w$ and $Y_w$ when ordered as above.

Fig. 1. The graph B (eight vertices, labeled 0 to 7, on each of its two sides)

The BCP family of graphs is defined as $BCP = \{B : B = B_w, w \in \Sigma^*\}$.

An Example. Consider the graph $B$ given above.

The reader can verify that $B = B_{cbca} = B_c \times B_b \times B_c \times B_a$.

For the sake of simplicity we shall represent graphs in the BCP family by
their label, e.g., the graph $B_{a^2bc^2} = B_a \times B_a \times B_b \times B_c \times B_c$ will be denoted by
$w = a^2bc^2 \in \Sigma^*$.



3 The n-LCP Family of Graphs 

Let $w_1, w_2, \ldots, w_n$ be a sequence of graphs in the BCP family such that the size
of the ceiling of $w_i$ is equal to the size of the floor of $w_{i+1}$ for $1 \le i \le n-1$.
We can construct an n-layered graph from the above graphs by identifying the
vertices in the floor of $w_{i+1}$ with the vertices in the ceiling of $w_i$ in their given
order.

Denote this graph by the "page"

    [ w_1 ]
    [ w_2 ]
    [ ... ]
    [ w_n ]

Definition 6. The n-LCP (n-layered cross-product) family of graphs is the set
of all graphs of the form

    [ w_1 ]
    [ w_2 ]
    [ ... ]
    [ w_n ]

such that $w_i \in BCP$ for $1 \le i \le n$ and the vertices in the floor of $w_{i+1}$ (whose
number is equal to the number of vertices in the ceiling of $w_i$) are identified with
the vertices in the ceiling of $w_i$, in their given order.
An Example. The n-layered $\Omega$ network [La75] can be described as follows:

1. The number of vertices in the floor and ceiling of every layer is equal to $2^n$.

2. All layers have the $i$-th vertex in the floor, $i = 1, \ldots, 2^n$, connected to the
vertices $2i-1$ and $2i$ modulo $2^n$ in the ceiling. It is easy to see that the
omega network is represented by the page, containing $n$ words, below.

    [ a c^{n-1} b ]
    [ a c^{n-1} b ]
    [     ...     ]    (n rows)

4 Prime n-Factors 

We define below a subset of $O(n^2)$ graphs in the n-LCP family which have the
"primality" property, i.e., all the graphs in the family can be represented as
cross-products of those factors and the primes themselves cannot be factorized
into simpler graphs.

Definition 7. A prime graph in the n-LCP family is a graph which can be
represented by a page as below:

a. Choose a row $0 \le i \le n$ and write $a$ in row $i$. Choosing the row 0 means that
$a$ is omitted altogether from the page.

b. Assume we choose position $i$ for $a$. Choose a row $i < j \le n+1$ and write
$b$ in row $j$. Choosing the row $n+1$ means that $b$ is omitted altogether from
the page.

c. Write 1 (1 represents the graph |) in rows $k$, $k < i$ or $k > j$, if such rows
exist (i.e., if $i > 1$ or $j < n$) and write $c$ in rows $t$, $i < t < j$, if such rows
exist (i.e., if $j > i+1$).

Several primes in n-LCP are shown below.

    [ c ]    [ 1 ]    [ 1 ]    [ a ]
    [ c ]    [ a ]    [ 1 ]    [ c ]
    [ c ]    [ b ]    [ 1 ]    [ c ]
    [ c ]    [ 1 ]    [ a ]    [ b ]


Remark. As the number of possible locations for $a$ is $n+1$ and for the $i$-th
location of $a$, there are $n-i+1$ possible locations for $b$, we have that the total
number of n-primes is $\sum_{i=0}^{n} (n-i+1) = \frac{(n+1)(n+2)}{2}$. It follows from the definition that
the primes do not factor into simpler factors, if we disregard the trivial prime
consisting of a page with a single 1 in every row, which is not included in the
above definition.
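
The count in this remark can be checked by enumerating the pages directly. The Python sketch below follows the three steps of Definition 7 as read here (row 0 meaning that a is omitted and row n+1 meaning that b is omitted); the function name `prime_pages` and the tuple representation of a page are assumptions made for the illustration.

```python
def prime_pages(n):
    """Enumerate the pages of the n-layered primes, one tuple per page.

    i is the row of a (0 = a omitted), j is the row of b (n + 1 = b
    omitted), with i < j; rows strictly between them hold c, and all
    remaining rows hold 1.
    """
    pages = []
    for i in range(0, n + 1):
        for j in range(i + 1, n + 2):
            page = []
            for k in range(1, n + 1):
                if k == i:
                    page.append("a")
                elif k == j:
                    page.append("b")
                elif i < k < j:
                    page.append("c")
                else:
                    page.append("1")
            pages.append(tuple(page))
    return pages


pages_for_4 = prime_pages(4)
```

For n = 4 the enumeration contains the four pages displayed above, and the page consisting only of 1's never appears, in line with the remark.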
The graphs corresponding to the primes described above are shown below.

(Figure: the graphs of the primes displayed above.)


5 Previous Results 

The following results have been obtained in [Pa2002]. 

— A polynomial algorithm is provided (and its correctness proved) that enables 
the factorization of any graph in n-LCP into prime factors. The complexity 
of the algorithm is shown to be 0{n^ log(if)) where n is the number of layers 
of the graph and E is the maximal number of edges in a layer of the graph. 

— The following theorem, showing that the factorization provided by the algo- 
rithm is unique, up to isomorphism, is also proved in [Pa2002]. 

Theorem 1. Let $N_1$ and $N_2$ be isomorphic networks in n-LCP. Let $N_1 = f_1 \times
\cdots \times f_{k_1}$, $N_2 = g_1 \times \cdots \times g_{k_2}$ be two factorizations of $N_1$ and $N_2$, respectively, into
prime factors, not necessarily distinct. Then $k_1 = k_2$, and the two factorizations
contain the same factors with the same multiplicity.

In the sequel we shall use the notation below. 

Notation. Consider the prime graphs in the n-LCP family as described in Def-
inition 7. We shall denote those graphs as below:

$\Delta_{i,j}$, $1 \le i \le j < n$, denotes the diamond-shaped graph with an $a$ at layer $i$, a $b$
at layer $j+1$, $c$'s between the $a$ and the $b$ and ones below the $b$ and above the $a$.
$X_{i,n}$, $1 \le i \le n$, denotes the fork-shaped graph with an $a$ at layer $i$, $c$'s below it
and ones above it.
$Y_{1,j}$, $1 \le j \le n$, denotes the Y-shaped graph with a $b$ at
layer $j$, $c$'s above it and ones below it.
$S_n$ is the prime with $c$ in all its layers.



Examples: The Baseline Network 

The Baseline network ([WF80]) can be described recursively as follows. The one-
layered Baseline has two vertices in its floor and in its ceiling and both floor
vertices are connected to both ceiling vertices. The $(i+1)$-layered Baseline is
constructed from two identical $i$-layered Baselines set on layers 2 to $i+1$. The first
layer set on top of the two $i$-layered Baselines is equal to the (constant) layer of
the $\Omega$ network. Thus the 3-layered Baseline and n-layered Baseline representations
are shown below.

    a c c b        a c^{n-1} b
    c a c b        c a c^{n-2} b
    c c a b        ...
                   c^{n-1} a b
    3-layered      n-layered

Applying the factorization algorithm we get for the n-layered Baseline $BL(n)$
the factorization below:

$BL(n) = X_{1,n} \times \cdots \times X_{n,n} \times Y_{1,n} \times Y_{1,n-1} \times \cdots \times Y_{1,1}$

The Butterfly network [L92] can be described by

    a b c^{n-1}
    c a b c^{n-2}
    ...
    c^{n-1} a b

and decomposes as:

$BY(n) = X_{1,n} \times \cdots \times X_{n,n} \times Y_{1,1} \times \cdots \times Y_{1,n}$

In a similar way one can show that the $\Omega$ network decomposes as:

$\Omega(n) = X_{n,n} \times \cdots \times X_{1,n} \times Y_{1,n} \times \cdots \times Y_{1,1}$

Thus, the three networks, the Omega, the Butterfly and the Baseline, decompose
into the same factors and are therefore isomorphic.

6 Some New Results 

As shown in the previous Sections 3 and 5, the Omega network, the Baseline
network and the Butterfly network have the same factorization consisting of all
the Y-shaped factors (the $Y_{1,i}$ factors) and all the fork-shaped factors (the $X_{i,n}$
factors), and therefore are isomorphic. In fact any network factorizing into a per-
mutation of the above factors is isomorphic to the Butterfly network. We shall
denote all the networks that are isomorphic to the Butterfly by B-networks.
While, as mentioned above, the order of the factors in any B-network is im-
material, this order becomes relevant when B-networks are combined. We shall
consider two types of combinations which have been studied in the literature.

(a) Two identical networks connected in tandem, i.e. the networks are combined
in a way such that the ceiling of the top layer of one network is identified
with the floor of the bottom layer of the second. If $H$ is a B-network, then
we shall denote this operation as $H/H$.
(b) Two identical networks are connected in a way such that one network is
connected to the mirror image of the second network where the floor of the
bottom layer of one network is identified with the ceiling of the top layer of
the mirror image (i.e. top-bottom reversal) of the second network. We shall
denote this connection as $H/\bar{H}$ ($\bar{H}$ denotes the mirror, top-down reversal,
of $H$).

A well-known combined network is the Benes network [L92]. The Benes network
can be described as $H/\bar{H}$ where $H$ is the Baseline network (see description in
Sect. 5). An important property of the Benes network is that it is rearrangeable,
i.e. for any mapping $\pi$ of the inputs to the outputs it is possible to construct
edge-disjoint paths in the network linking the $i$-th vertex in the floor of the
network to the $\pi(i)$-th vertex in the ceiling of the network (see [L92, Theorem
3.10, p.452]). It is well known that the Benes network is isomorphic to $BY/\overline{BY}$,
where $BY$ is the Butterfly network (of the same order). See e.g. [EL97]. Based
on the theory of decomposition into prime factors of networks described in Sect.
5, we can now generalize the isomorphism range and prove the theorem below.

Theorem 2. Let $H_1(n)$ and $H_2(n)$ be any two n-layered B-networks. Then
$H_1(n)/\bar{H}_1(n)$ is isomorphic to $H_2(n)/\bar{H}_2(n)$.

Proof (of theorem). As mentioned before, the set of factors of any n-layered B-
network $H(n)$ contains the $Y_{1,i}$ and the $X_{i,n}$, $1 \le i \le n$, factors and no other
factors. In the mirror image $\bar{H}(n)$ of $H(n)$ the $Y_{1,i}$ factors will transform into $X_{n-i+1,n}$
factors, the $X_{j,n}$ factors will transform into $Y_{1,n-j+1}$ factors, and the order of the
corresponding factors in $\bar{H}(n)$ is the same as the order of the source factors in
$H(n)$. The factors generated in $H(n)/\bar{H}(n)$ are depicted below where, for the
sake of saving space, we represent the prime factors in transposition, as rows
instead of columns, $(\;)_m$ representing a prime with $m$ layers.

(i) A factor $X_{i,n}$ of $H(n)$, represented as $(1, \ldots, 1, a, c, \ldots, c)_n$ with $a$ at
position $i$, generates the factor $(c, \ldots, c, b, 1, \ldots, 1)_n$, with $b$ at position
$n-i+1$, in $\bar{H}(n)$. Those two factors combine in $H(n)/\bar{H}(n)$ into the factor
$(1, \ldots, 1, a, c, \ldots, c, b, 1, \ldots, 1)_{2n}$, with $a$ at position $i$ and $b$ at position
$2n-i+1$, whose label is $\Delta_{i,2n-i}$ (see definitions in Sect. 5).

(ii) A factor $Y_{1,j}$ of $H(n)$, represented as $(c, \ldots, c, b, 1, \ldots, 1)_n$ with $b$ at
position $j$, generates the factor $(1, \ldots, 1, a, c, \ldots, c)_n$, with $a$ at position
$n-j+1$, of $\bar{H}(n)$. Those factors do not combine in $H(n)/\bar{H}(n)$ but rather
"stretch out" into the factors $(c, \ldots, c, b, 1, \ldots, 1)_{2n}$, with $b$ at position $j$,
and $(1, \ldots, 1, a, c, \ldots, c)_{2n}$, with $a$ at position $2n-j+1$, whose labels are
$Y_{1,j}$ and $X_{2n-j+1,2n}$ respectively.

All in all, the set of all factors of $H(n)/\bar{H}(n)$, for any n-layered
B-network $H$, consists of the following factors:

(i) the diamond-shaped factors $\Delta_{i,2n-i}$, $1 \le i \le n$;

(ii) the Y-shaped factors $Y_{1,j}$, $1 \le j \le n$;

(iii) the fork-shaped factors $X_{2n-j+1,2n}$, $1 \le j \le n$.

Thus all the 2n-layered networks $H(n)/\bar{H}(n)$ such that $H(n)$ is an n-layered
B-network are isomorphic. The theorem is thus proved. □

It was also mentioned in the literature that the $H/H$ combination is not
isomorphic to the $H/\bar{H}$ combination for some B-networks (e.g. the Butterfly)
while for some other networks the above combinations are isomorphic (e.g. the
Baseline network; see [EL97]). We provide now a characterization of the B-
networks $H$ such that $H/H$ is isomorphic to $H/\bar{H}$.

Definition 8. Let $H$ be an n-layered B-network. Assume that the order of the
fork-shaped factors in the factorization of $H$ is $X_{i_1,n}, X_{i_2,n}, \ldots, X_{i_n,n}$ and the
order of the Y-shaped factors is $Y_{1,j_1}, Y_{1,j_2}, \ldots, Y_{1,j_n}$. $H$ will be called a match-
ing network if $(i_1, i_2, \ldots, i_n) = (j_n, j_{n-1}, \ldots, j_1)$. $H$ will be called touching if
$(i_1, i_2, \ldots, i_n) = (j_1, j_2, \ldots, j_n)$.

Theorem 3. Let $H$ be an n-layered B-network. Then $H/H$ is isomorphic to
$H/\bar{H}$ if and only if $H$ is matching.

Remark 1. The reader can verify that the Baseline network and the Butterfly
network described in Sect. 5 are matching and touching, respectively.

The proof of Theorem 3 is similar to the proof of Theorem 2. The reader can
verify that, if and only if $H$ is matching, then the order of the Y-shaped factors
in $\bar{H}$ generated by the fork-shaped factors in $H$ is the same as the order of the
Y-shaped factors in $H$ and that the set of factors of $H/H$ is equal to the set
of factors of $H/\bar{H}$ which was shown in the proof of Theorem 2, given that $H$ is
matching.

Corollary 1. Let $H_1$ and $H_2$ be two n-layered B-networks. Then $H_1/\bar{H}_1$ is iso-
morphic to $H_2/\bar{H}_2$ and, if and only if $H_2$ is matching, then $H_1/\bar{H}_1 \cong H_2/\bar{H}_2 \cong
H_2/H_2$.

The power of Theorems 2 and 3 and their corollary stems from the fact
that all the combined B-networks of type $H/\bar{H}$ are rearrangeable, a very impor-
tant property for communication purposes.

Finally we can prove the following. 

Theorem 4. Let $H_1$ and $H_2$ be any two n-layered B-networks. If both $H_1$ and
$H_2$ are touching then $H_1/H_1$ is isomorphic to $H_2/H_2$.

The proof, which is similar to the proof of Theorem 2, is left to the reader. It
can easily be verified that the Butterfly network is touching and it is known
that BY(n)/BY(n) is not isomorphic to BY(n)/BY(n). This follows directly
from Theorem 3. The question whether BY(n)/BY(n) is rearrangeable is an
open problem (see [EL97]). It may be easier to approach this problem using
an isomorphic network which is homogeneous. E.g., as the $\Omega$ network is also
touching we may try to analyze the network $\Omega(n)/\Omega(n)$ which is isomorphic to
BY(n)/BY(n). As all the layers of $\Omega(n)/\Omega(n)$ have the same form, a b,
we can rephrase the problem in a more general form, i.e.:
1. Is it possible to construct a rearrangeable network by concatenating layers
of the form a b?

2. If the answer to (1) is “yes” then find the minimal k such that the network
consisting of k layers of the form a $c_{n-1}$ b is rearrangeable.
Abstract. Regularly extended EOL grammars allow an infinite number
of rules for a given nonterminal provided that the set of right sides of the 
rules for each nonterminal is a regular language. We show that structural 
equivalence remains decidable for regularly extended EOL grammars. 



1 Introduction 

The decidability of structural equivalence of context-free grammars is a classical 
result [9, 13] and a simplified automata theoretic decidability proof is known 
from [19]. The question remains decidable for EOL grammars [10-12,17] which 
are parallel context-free grammars. In the case where the parallel derivation is 
controlled by a finite set of tables, that is, we have ETOL grammars, structural 
equivalence is already undecidable [16]. On the other hand, if the tables are 
restricted to be homomorphisms [18], or if we consider the strong equivalence 
that compares also the sequences of tables used [8], this question again becomes 
decidable. 

All of the above results are obtained for context-free type grammars with a 
finite set of rules, and the corresponding syntax trees have nodes of bounded 
arity. Due to many applications in document grammars, recently there has been 
much interest in regularly extended context-free type grammars [1-3,5], as well
as in tree automata operating on unranked trees [3,4].

It has been shown that structural equivalence remains decidable also for regu- 
larly extended (sequential) context-free grammars [5] . In this paper we establish 
the decidability of structural equivalence for regularly extended EOL grammars. 
Our proof uses tree automata, following the approach from [19,17]. Naturally 
the syntax trees of (regularly extended) EOL grammars cannot be recognized by 
finite tree automata and the automata used in [17] relied on an explicit height
counting mechanism. To simplify the constructions, here instead of an explicit 
height counting capability we restrict the form of the input trees and show that 
equivalence of the tree automata on the restricted set of input trees can be 
decided effectively. 

The decidability proof for regularly extended context-free grammars in [5] 
uses a grammatical approach where the given grammars are transformed into a 
normal form and arbitrary grammars are shown to be structurally equivalent if 
and only if the normal forms are identical. Similarly a grammatical approach is 
used in [10-12] for deciding the structural equivalence of EOL grammars with 
finitely many rules. Already for sequential and parallel grammars with finitely 
many rules, the proof establishing the uniqueness of the normal form grammar 
is more involved than the decidability proof based on tree automata, but it has 
the advantage of explicitly giving a grammatical normal form. We do not know 
whether a normal form grammar can be constructed for regularly extended EOL 
grammars such that it is unique for structurally equivalent grammars. 



2 Regularly Extended EOL Grammars and Tree 
Automata 

We briefly recall some definitions concerning regularly extended EOL grammars 
and tree automata. For all unexplained notions on formal languages we refer the 
reader to [20]. A general reference for L systems is [14] and for information on 
tree automata we refer the reader to [6,7]. 

The cardinality of a finite set A is #A and the power set of A is $\wp(A)$. When
there is no danger of confusion, a singleton set {b} is denoted simply as b. All
alphabets we consider are finite. The set of words (respectively, nonempty words)
over an alphabet A is $A^*$ (respectively, $A^+$) and $\lambda$ denotes the empty word.

A regularly extended EOL grammar, or reEOL grammar, is a tuple 

$G = (V, \Sigma, S, P)$, (1)

where V is an alphabet of nonterminals, $\Sigma$ is an alphabet of terminals
($V \cap \Sigma = \emptyset$), $S \in V$ is the start nonterminal and
$P \subseteq V \times (V \cup \Sigma)^*$ is a set of productions
such that for any $B \in V$ the language

$\{w \in (V \cup \Sigma)^* \mid (B, w) \in P\}$

is regular.

Note that we allow the set of productions to be infinite. As usual we
denote productions $(B, w) \in P$ as $B \to w$.

The productions of P define in the well-known way the parallel one-step
rewrite relation $\Rightarrow\ \subseteq (V \cup \Sigma)^* \times (V \cup \Sigma)^*$ defined by setting
$w_1 \Rightarrow w_2$ if and only if

$w_1 = B_1 \cdots B_n$, $w_2 = u_1 \cdots u_n$, $B_i \to u_i \in P$, $i = 1, \ldots, n$.
The language generated by the grammar G is

$L(G) = \{w \in \Sigma^* \mid S \Rightarrow^* w\}$.

Note that our definition does not allow the rewriting of terminals, that is,
the reEOL grammar is synchronized [14]. Thus the one-step rewrite relation is
in fact a subset of $V^* \times (V \cup \Sigma)^*$. It is well known that this restriction does not
cause any restriction in terms of the family of languages generated.
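
As an illustration of the parallel rewrite relation, the following is a minimal Python sketch for the special case of a grammar with finitely many productions. The function names, the dictionary encoding of P, and the example grammar are assumptions made for this sketch and do not come from the paper; a genuinely regularly extended grammar would need a finite description (for example an automaton) of each set of right sides.

```python
from itertools import product

def parallel_step(word, productions):
    """All words w2 with word => w2: every symbol is rewritten simultaneously."""
    if not word:
        return set()
    choices = []
    for symbol in word:
        rhs_list = productions.get(symbol)
        if not rhs_list:
            # Terminals have no productions (the grammar is synchronized),
            # so a word containing a terminal cannot be rewritten.
            return set()
        choices.append(rhs_list)
    return {"".join(combo) for combo in product(*choices)}

def terminal_words(start, productions, terminals, steps):
    """Terminal words derivable from `start` in at most `steps` parallel steps."""
    current, found = {start}, set()
    for _ in range(steps):
        nxt = set()
        for w in current:
            nxt |= parallel_step(w, productions)
        found |= {w for w in nxt if all(c in terminals for c in w)}
        current = nxt
    return found

# Hypothetical example grammar: S -> SS | a.
EXAMPLE_P = {"S": ["SS", "a"]}
EXAMPLE_WORDS = terminal_words("S", EXAMPLE_P, {"a"}, 4)
```

In this example a sentential form such as aSS is a dead end, because the terminal a cannot be rewritten in the next parallel step; only forms consisting entirely of nonterminals can be continued.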

We assume that notions such as the root, a leaf and the height of a tree are
known. The height of a tree with a single node is defined to be zero. If C is a
(finite) set, by a C-tree we mean a rooted ordered tree the nodes of which are
labeled by elements of C. Note that the same symbol of C may be used to label
nodes having different numbers of children, that is, we consider trees where the
node labels have variable arity. The set of all C-trees is denoted $F_C$.

Next we define the syntax-trees of an reEOL grammar. In the following G is 
always as in (1). 

Denote $\hat{\Sigma} = \Sigma \cup V \cup \{\lambda\}$. Here $\lambda$ is a new symbol that will be used to label
nodes corresponding to the empty word. A leaf of $t \in F_{\hat{\Sigma}}$ is said to be a non-$\lambda$
leaf if it is labeled by some element other than $\lambda$.

The set of syntax trees of G, S(G), is a subset of $F_{\hat{\Sigma}}$ defined inductively as
follows. A tree with one node labeled by a symbol of $\hat{\Sigma}$ is in S(G). Assume that
$t \in S(G)$ and the non-$\lambda$ leaves of t are $u_1, \ldots, u_m$, $m \geq 1$, where $u_i$ is labeled by
$B_i \in V$, and $B_i \to a^i_1 \cdots a^i_{k_i} \in P$, $k_i \geq 0$, $a^i_j \in V \cup \Sigma$, $j = 1, \ldots, k_i$, $i = 1, \ldots, m$.
Then the tree t' is in S(G) if t' is obtained from t by attaching, for each node $u_i$,
$k_i$ children that are labeled, respectively, by the symbols $a^i_1, \ldots, a^i_{k_i}$. If $k_i = 0$,
the node $u_i$ has one child labeled by $\lambda$.

According to the above definition the root of a syntax tree can be labeled
by any symbol of $\hat{\Sigma}$. This is useful in inductive constructions where we consider
derivations beginning with any grammar symbol. Below the terminal syntax 
trees are required to represent derivations that begin with the start nonterminal 
and produce a terminal word. 

A syntax tree $t \in S(G)$ is said to be terminal if the root of t is labeled by the
start nonterminal S and all leaves of t are labeled by elements of $\Sigma \cup \{\lambda\}$. The
set of terminal syntax trees of G is denoted TS(G). The terminal syntax trees
correspond in the obvious way to parallel derivations of the grammar G yielding
terminal words.

For $t \in F_{\hat{\Sigma}}$ we denote by yield(t) the word over $V \cup \Sigma$ obtained by
concatenating from left to right the labels of all non-$\lambda$ leaves of t. Then

$L(G) = \{\mathrm{yield}(t) \mid t \in TS(G)\}$.

The structure of a syntax tree $t \in S(G)$, str(t), is the tree obtained from t
by replacing the label of each internal node of t by $\varpi$. Here $\varpi$ is a new symbol
not appearing in $\hat{\Sigma}$. Essentially, the structure trees can be considered to
have labels only for the leaves. The special symbol $\varpi$ is used only to make it
possible to define transitions of a tree automaton without a separate assumption
concerning unlabeled nodes.
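
The two operations on syntax trees defined above can be sketched in Python as follows. The tree encoding (a pair of a label and a list of children) and the ASCII placeholder strings for the empty-word symbol and the internal-node symbol are assumptions of this sketch, not notation from the paper.

```python
LAMBDA = "<lambda>"   # placeholder label for nodes representing the empty word
VARPI = "<varpi>"     # placeholder for the new symbol labeling internal nodes

def yield_of(tree):
    """Concatenate, left to right, the labels of all non-lambda leaves."""
    label, children = tree
    if not children:
        return "" if label == LAMBDA else label
    return "".join(yield_of(child) for child in children)

def structure(tree):
    """Replace the label of every internal node by VARPI; leaves are kept."""
    label, children = tree
    if not children:
        return (label, [])
    return (VARPI, [structure(child) for child in children])

# Hypothetical syntax tree for the derivation S => AB => a, using the
# productions S -> AB, A -> a and B -> (empty word).
EXAMPLE_TREE = ("S", [("A", [("a", [])]), ("B", [(LAMBDA, [])])])
```

Two grammars are structurally equivalent exactly when the sets obtained by applying `structure` to their terminal syntax trees coincide, so nonterminal names play no role in the comparison.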







Kai Salomaa and Derick Wood 



The set of terminal structure trees of G is 

$STS(G) = \{\mathrm{str}(t) \mid t \in TS(G)\}$.

The following definition generalizes the notion of EOL structural equivalence 
[10-12, 17] for regularly extended grammars. 

Definition 1. Let $G_1$ and $G_2$ be regularly extended EOL grammars. The
grammars $G_1$ and $G_2$ are structurally equivalent if

$STS(G_1) = STS(G_2)$.

To conclude this section we recall some basic definitions concerning tree au- 
tomata that will be needed for the proof of our main result. Differing from the 
usual model of tree automata operating on trees over ranked alphabets, we con- 
sider trees over unranked alphabets, see e.g. [4]. 

Let $\Omega$ be a finite alphabet that is used to label the trees. A regularly extended
bottom-up tree automaton is a tuple

$M = (\Omega, Q, Q_F, \delta)$, (2)

where Q is a finite set of states, $Q_F \subseteq Q$ is a set of accepting states and $\delta$
associates to each $\tau \in \Omega$ a relation $\tau_\delta \subseteq Q^* \times Q$ such that for every $q \in Q$,
$\tau \in \Omega$ the set

$L(q, \tau) = \{w \in Q^* \mid (w, q) \in \tau_\delta\}$ (3)

is regular. In the following, unless otherwise mentioned, by a tree automaton we
always mean a regularly extended bottom-up tree automaton.

By an M-configuration we mean an $(\Omega \cup Q)$-tree where elements of Q occur
only as labels of leaves, and the set of M-configurations is conf(M). The
computation relation of M, $\vdash_M\ \subseteq \mathrm{conf}(M) \times \mathrm{conf}(M)$, is defined as follows. Let
$t, t' \in \mathrm{conf}(M)$. Then $t \vdash_M t'$ if t' is obtained from t by replacing a subtree
$\tau(q_1, \ldots, q_m)$, $\tau \in \Omega$, $q_1, \ldots, q_m \in Q$ by a single node labeled by $p \in Q$ such that

$(q_1 \cdots q_m, p) \in \tau_\delta$.

Note that if above $\tau$ labels a leaf of t (that is, m = 0), then $\tau$ is replaced by a
state p such that $(\lambda, p) \in \tau_\delta$.

The forest, or tree language, ($\subseteq F_\Omega$) recognized by M is

$L(M) = \{t \in F_\Omega \mid (\exists q \in Q_F)\ t \vdash^*_M q\}$.

Above q denotes the tree with a single node labeled by q.

A tree automaton (2) is deterministic if for each $\tau \in \Omega$ and $w \in Q^*$ there is
at most one $p \in Q$ such that $(w, p) \in \tau_\delta$. It is well known that for an arbitrary
nondeterministic bottom-up tree automaton we can construct an equivalent
deterministic automaton. A forest is regular if it is accepted by a (deterministic or
nondeterministic) bottom-up tree automaton.
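
To make the computation relation concrete, here is a minimal Python sketch of a bottom-up run. It rests on several assumptions that are not part of the paper: trees are pairs of a label and a list of children, every state is a single character, and each regular language L(q, τ) is given as a Python regular expression over state characters. The sketch enumerates all words of child states, which is exponential in the worst case and only meant for small examples.

```python
import re

def run(tree, delta):
    """Set of states the automaton can reach at the root of `tree`.

    `delta` maps a label to a list of (pattern, state) pairs, meaning that
    (w, state) is in the relation for that label whenever the word w of
    child states fully matches `pattern`.
    """
    label, children = tree
    words = {""}                      # a leaf has the empty word of child states
    for child in children:
        child_states = run(child, delta)
        words = {w + q for w in words for q in child_states}
    reached = set()
    for pattern, state in delta.get(label, []):
        if any(re.fullmatch(pattern, w) for w in words):
            reached.add(state)
    return reached

def accepts(tree, delta, accepting):
    """True if some computation reaches the root in an accepting state."""
    return bool(run(tree, delta) & accepting)

# Hypothetical automaton: leaves labeled "a" go to state x; a node labeled
# "f" goes to state x if it has a positive even number of children in state x.
EXAMPLE_DELTA = {"a": [("", "x")], "f": [("(xx)+", "x")]}
LEAF = ("a", [])
```

The example automaton is deterministic in the sense of the definition above, since for each label and each word of child states at most one state is produced.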
The following lemma is proved using the standard direct product construction.
First we just need to add a “dead state” for the second automaton $M_2$
which ensures that the computation of $M_2$ always reaches the root. The set of
accepting states of the direct product automaton consists of all pairs where the
first component is an accepting state of $M_1$ and the second component is not
an accepting state of $M_2$. It is easy to verify that the state transition relation
obtained from the direct product construction satisfies condition (3), that is, the
constructed automaton is also a regularly extended tree automaton.

Lemma 1. Let $M_i$, $i = 1, 2$, be deterministic tree automata.

Then we can effectively construct a deterministic tree automaton M such that

$L(M) = L(M_1) - L(M_2)$.



3 The Main Result 

Clearly the set of syntax trees (or structures of syntax trees) of an reEOL gram- 
mar is not regular since a finite tree automaton cannot check that the tree 
represents a parallel derivation. However, we can use tree automata to decide 
structural equivalence by restricting consideration to balanced input trees or, 
strictly speaking, to input trees where each path from the root to a non-$\lambda$ leaf
is of equal length. Such trees correspond to parallel EOL derivations and then it 
is sufficient for the tree automaton to check that the derivation is correct locally. 
For the decidability result it is essential that the automaton can be constructed 
to be deterministic. 

The following notation is introduced to deal with “almost balanced” trees
corresponding to derivations of an reEOL grammar, where leaves labeled by $\lambda$
can occur at any depth. Let $\Omega$ be an alphabet and $\tau \in \Omega$. The set

$\mathrm{BAL}_\Omega(\tau) \subseteq F_\Omega$ (4)

is defined to consist of all $\Omega$-trees t such that

(i) The symbol $\tau$ occurs only as a label of leaves in t.

(ii) If $u_1$, $u_2$ are any leaves of t not labeled by $\tau$ then the distance of $u_1$ and $u_2$
from the root of t is the same.

The above conditions mean that t is a balanced tree with the exception that
leaves labeled by the special symbol $\tau$ can occur at any level.
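
Conditions (i) and (ii) can be checked by a single traversal. The following Python sketch assumes the same hypothetical tree encoding as a pair of a label and a list of children; the function name and the example trees are illustrative only.

```python
def in_bal(tree, tau):
    """Check whether `tree` satisfies conditions (i) and (ii) for symbol `tau`."""
    depths = set()    # depths at which leaves not labeled by tau occur

    def visit(node, depth):
        label, children = node
        if children:
            if label == tau:
                return False          # violates (i): tau labels an internal node
            return all(visit(child, depth + 1) for child in children)
        if label != tau:
            depths.add(depth)
        return True

    # Condition (ii): all leaves not labeled by tau lie at the same depth.
    return visit(tree, 0) and len(depths) <= 1

# Hypothetical examples with tau = "L" and internal nodes labeled "w".
BALANCED = ("w", [("w", [("a", [])]), ("L", []), ("w", [("b", [])])])
UNBALANCED = ("w", [("a", []), ("w", [("b", [])])])
```

In `BALANCED` the leaf labeled "L" sits at depth 1 while the other leaves sit at depth 2, which is allowed; in `UNBALANCED` two leaves not labeled by "L" sit at different depths.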

Let G be an reEOL grammar as in (1) and $\Omega = \varpi \cup \Sigma \cup \lambda$. Then we can note
that all structures of terminal syntax trees of G belong to $\mathrm{BAL}_\Omega(\lambda)$.

The following lemma is an extension of the well known corresponding result 
for (extended) context-free grammars with the additional requirement that we 
consider only “almost parallel” input trees. 

Lemma 2. Let $G = (V, \Sigma, S, P)$ be an reEOL grammar and $\Omega = \varpi \cup \Sigma \cup \lambda$.
Then we can effectively construct a deterministic tree automaton M such that

$L(M) \cap \mathrm{BAL}_\Omega(\lambda) = STS(G)$. (5)
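Proof. Choose

$M = ((\varpi \cup \Sigma \cup \lambda), (\wp(V) \cup \bar{\Sigma} \cup q_\lambda), \mathcal{V}, \delta)$,

where $\bar{\Sigma} = \{\bar{\sigma} \mid \sigma \in \Sigma\}$, $\mathcal{V} = \{U \subseteq V \mid S \in U\}$ and $\delta$ is defined as follows.

(i) For $\sigma \in \Sigma$, we set $\sigma_\delta = \{\lambda\} \times \{\bar{\sigma}\}$.

(ii) $\lambda_\delta = \{\lambda\} \times \{q_\lambda\}$.

(iii) Let $U_1, \ldots, U_m \in \wp(V)$, $m \geq 1$. Then $(U_1 \cdots U_m, U) \in \varpi_\delta$ for

$U = \{B \in V \mid (\exists B_i \in U_i, i = 1, \ldots, m)\ B \to B_1 \cdots B_m \in P\}$.

The set U is the only set X such that $(U_1 \cdots U_m, X) \in \varpi_\delta$.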

Since P is the production set of an reEOL grammar we know that for any set
$U \subseteq V$ the set

$\{U_1 \cdots U_m \mid (\forall B \in U)(\exists B_i \in U_i, i = 1, \ldots, m)\ B \to B_1 \cdots B_m \in P\}$ ($\subseteq \wp(V)^*$)

is an intersection of finitely many regular languages, and hence regular. This
means that the rules defined in (iii) can be used in a regularly extended tree
automaton. Directly by their definition, all the rules (i), (ii) and (iii) are
deterministic.

Note that rules (i) mean that at a leaf labeled by $\sigma$ the computation begins
in state $\bar{\sigma}$. Similarly, rule (ii) says that at leaves labeled by $\lambda$ the computation
begins in state $q_\lambda$ (which is a new symbol that we use instead of the rather
cumbersome notation $\bar{\lambda}$). Let t be any tree of height at least one. Using the
definition of the rules (iii) and induction on the height of a tree $t \in \mathrm{BAL}_\Omega(\lambda)$
we see that M reaches the root of t in a state $U \in \wp(V)$ where

$U = \{B \in V \mid (\exists t' \in S(G))$ such that the root of t' is labeled by B and $\mathrm{str}(t') = t\}$.

Note that if $t \in \mathrm{BAL}_\Omega(\lambda)$ has some internal nodes labeled by symbols other
than $\varpi$ then $U = \emptyset$ and the computation of M becomes blocked before reaching
the root.

By the choice of the set of accepting states V the above means that the 
equation (5) holds. □ 

It is well known that emptiness of regularly extended tree automata is
decidable. In the following two lemmas we show that emptiness modulo the set
of “almost balanced” trees can also be decided effectively.

We say that a tree t is k-bounded if any node of t has at most k children. 

Lemma 3. Given a tree automaton $M = (\Omega, Q, Q_F, \delta)$ we can effectively
compute a constant k such that the following condition holds.

Let $\tau \in \Omega$ and suppose that $t \in L(M) \cap \mathrm{BAL}_\Omega(\tau)$. Then there exists a
k-bounded tree $t' \in L(M) \cap \mathrm{BAL}_\Omega(\tau)$ such that the height of t' is less than or equal
to the height of t.
Proof. For $q \in Q$ and $\omega \in \Omega$ let $n_{q,\omega}$ denote the number of states of the minimal
deterministic finite automaton for the language $L(q, \omega)$ from (3). Choose

$k = \max_{q \in Q,\ \omega \in \Omega} n_{q,\omega}$. (6)

Consider an arbitrary $t \in L(M) \cap \mathrm{BAL}_\Omega(\tau)$ and let u be a node of t having m
children, where m > k. Let C be an accepting computation of M on t and let
$(q_1 \cdots q_m, p) \in \omega_\delta$ be the computation step used at the node u, where $\omega \in \Omega$ is
the label of u and $q_1, \ldots, q_m, p \in Q$. Since m > k, the equation (6) implies that
there exist $1 \leq i < j \leq m$ such that $q_1 \cdots q_i q_{j+1} \cdots q_m \in L(p, \omega)$, that is,

$(q_1 \cdots q_i q_{j+1} \cdots q_m, p) \in \omega_\delta$. (7)

Let t' be the tree that is obtained from t by deleting the (i + 1)st, ..., jth
immediate subtrees of the node u. By (7) and the fact that C is an accepting
computation, M has an accepting computation on t'. Also we note that if r is
any tree in $\mathrm{BAL}_\Omega(\tau)$ and we delete a number of immediate subtrees of a node
v of r, the resulting tree is still in $\mathrm{BAL}_\Omega(\tau)$ assuming that we do not delete all
the immediate subtrees of v. Thus $t' \in \mathrm{BAL}_\Omega(\tau)$.

By repeating the above process as long as there are nodes having more than
k children, we obtain a k-bounded tree that is in the forest $L(M) \cap \mathrm{BAL}_\Omega(\tau)$
and each step in the process either preserves the height of the tree or decreases
it. The latter can happen if all the remaining subtrees of the given node have all
leaves labeled by the special symbol $\tau$.

Also it is clear that given the transition relation of the tree automaton M 
we can effectively compute the constant k. □ 
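
The deletion step in the proof is the usual pumping argument on a deterministic finite automaton, applied to the word of child states. The Python sketch below illustrates it on an ordinary word. It assumes a complete DFA given as a dictionary from (state, symbol) to state and a bound k that is at least the number of DFA states; the function name and the example automaton are hypothetical.

```python
def shorten(word, start, trans, k):
    """Delete factors between repeated DFA states until the length is at most k.

    The DFA state reached after the whole word is unchanged, so membership
    in the language of the DFA is preserved. The first symbol is never deleted.
    """
    word = list(word)
    while len(word) > k:
        state = start
        seen = {}                       # state -> length of the prefix reaching it
        for pos, symbol in enumerate(word, start=1):
            state = trans[(state, symbol)]
            if state in seen:
                i, j = seen[state], pos
                del word[i:j]           # removes positions i+1, ..., j (1-based)
                break
            seen[state] = pos
        else:
            break                       # no repetition found; k was too small
    return "".join(word)

# Hypothetical complete DFA for (ab)*: state 0 is initial and accepting,
# state 2 is a sink.
AB_STAR = {(0, "a"): 1, (1, "b"): 0, (0, "b"): 2, (1, "a"): 2,
           (2, "a"): 2, (2, "b"): 2}
```

Only prefixes of length at least one are compared, which mirrors the requirement in the proof that not all immediate subtrees of a node are deleted.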

For the next result recall that the notation $\mathrm{BAL}_\Omega(\tau)$ is as defined in (4).

Lemma 4. Given a tree automaton $M = (\Omega, Q, Q_F, \delta)$ and $\tau \in \Omega$ we can
effectively decide whether or not

$L(M) \cap \mathrm{BAL}_\Omega(\tau) = \emptyset$.

Proof. Let $t \in \mathrm{BAL}_\Omega(\tau)$. The ith level of t is defined to consist of nodes that
are at distance i from the root. Thus

all subtrees rooted at level i nodes and having
a non-$\tau$ leaf have the same height. (8)

By a non-$\tau$ leaf we mean a leaf node labeled by a symbol in $\Omega - \tau$.

Assume that M accepts t and let C be the accepting computation of M on t.
Let $A_i$ denote the set of states that the computation C reaches at nodes on the
ith level, where $0 \leq i \leq \mathrm{height}(t)$. Now if $\mathrm{height}(t) > 2^{\#Q}$ we can guarantee that
there exist $0 \leq i < j \leq \mathrm{height}(t)$ such that $A_i \subseteq A_j$. Let t' be the tree obtained
from t by replacing each subtree r rooted at the ith level with a subtree s rooted
at the jth level such that computation C reaches the root of r and the root of s
in the same state. Thus M accepts t'. Also, by (8), $t' \in \mathrm{BAL}_\Omega(\tau)$.
By repeating the above process we see that if $L(M) \cap \mathrm{BAL}_\Omega(\tau)$ is not empty
then it contains a tree of height at most $2^{\#Q}$. Now the result of Lemma 3
implies that $L(M) \cap \mathrm{BAL}_\Omega(\tau)$ contains a k-bounded tree of height at most $2^{\#Q}$
where k can be effectively computed when the tree automaton M is given. This
means that in order to decide emptiness of $L(M) \cap \mathrm{BAL}_\Omega(\tau)$, it is sufficient to
check whether M accepts any of a finite number of candidate trees belonging to
$\mathrm{BAL}_\Omega(\tau)$, where the candidate trees can be effectively found. □

Now we are ready to prove the main result. 

Theorem 1. Structural equivalence of regularly extended EOL grammars is
decidable.

Proof. Assume we are given reEOL grammars $G_1$ and $G_2$ with terminal alphabet
$\Sigma$. Without loss of generality we can assume that both grammars use the same
terminal alphabet, otherwise we consider the union of the terminal alphabets.
Let $\Omega = \varpi \cup \Sigma \cup \lambda$. By Lemma 2, we can construct deterministic regularly
extended tree automata $M_i$ such that

$L(M_i) \cap \mathrm{BAL}_\Omega(\lambda) = STS(G_i)$, $i = 1, 2$.

By Lemma 1 we can construct (deterministic) tree automata $N_{1,2}$ and $N_{2,1}$ such
that

$L(N_{i,j}) \cap \mathrm{BAL}_\Omega(\lambda) = (L(M_i) \cap \mathrm{BAL}_\Omega(\lambda)) - (L(M_j) \cap \mathrm{BAL}_\Omega(\lambda))$, $\{i, j\} = \{1, 2\}$.

Now $G_1$ and $G_2$ are structurally equivalent if and only if $L(N_{i,j}) \cap \mathrm{BAL}_\Omega(\lambda) = \emptyset$,
when (i, j) = (1, 2) and (i, j) = (2, 1). By Lemma 4 these conditions can be tested
algorithmically. □

Lemma 2 can be immediately modified to show that we can construct a
deterministic tree automaton M such that $L(M) \cap \mathrm{BAL}_\Omega(\lambda) = TS(G)$, where
$G = (V, \Sigma, S, P)$ and $\Omega = V \cup \Sigma \cup \lambda$. Recognizing the syntax trees of G is in fact
easier than recognizing the structures of syntax trees of G (in both cases modulo
the set of “almost balanced” trees) since in the case of syntax trees the rules (iii)
in the construction of the proof of Lemma 2 do not need to consider all possible
nonterminals appearing as left side of the rule; in syntax trees the nonterminal
is given as the label of the node. This means that the proof of Theorem 1
immediately gives the following:

Corollary 1. For given reEOL grammars $G_1$ and $G_2$ we can effectively decide
whether or not $TS(G_1) = TS(G_2)$, i.e., whether or not the grammars are syntax
equivalent.

To conclude we can note that the algorithm obtained from the proof of Theo- 
rem 1 is extremely inefficient and the complexity of the structural equivalence of 
reEOL grammars remains open. From [15] we get a lower bound for the complex- 
ity since the structural equivalence of EOL grammars with finitely many rules is 
already hard for deterministic exponential time. 
Abstract. We study a versatile model of evolving interactive comput-
ing: lineages of automata. A lineage consists of a sequence of interactive 
finite automata, with a mechanism of passing information from each au- 
tomaton to its immediate successor. Lineages enable a definition of a 
suitable complexity measure for evolving systems. We show several com- 
plexity results, including a hierarchy result. 



1 Introduction 

It is commonly recognised that the Turing machine model, which has long been 
the basis for theoretical computer science, fails to capture all the characteristics 
of modern day computing (cf. [5, 11, 13]). When we think of a modern networked 
computing system, we see a device that can interact with its environment and 
that can be changed over time (by installing new hardware or upgrading the 
software). The system is ‘always on’ and a computation on such a device can 
extend arbitrarily, which implies the existence of potentially infinite computa- 
tions. Networked machines and the programs that run on them are examples of 
evolving interactive systems ([4,6]). Further examples are given in [14, 15]. 

If we look at a (deterministic) computation (or a transduction) from a tra- 
ditional point of view, the entire input and the underlying system are fixed the 
moment we start the computation. In a more realistic setting, we want a user 
to generate the input interactively and allow the user to alter the system’s be- 
haviour during the computation. Last but not least, computations can be poten- 
tially never-ending. Systems that allow these kinds of computations are called 
evolving interactive systems. In this paper, we define lineages of automata, a 
simple yet elegant model which captures the evolving aspect of computational 
systems in a natural way. It turns out that even this simple model is more power- 
ful than classical Turing machines. This was observed in [4, 6], where lineages of 
automata were shown to be equivalent to so-called interactive Turing machines 
with advice. The latter machines are known to possess super-Turing computing 
power. 

A lineage is a sequence of interactive, finite automata with a mechanism of 
passing information from each automaton to its immediate successor and the 
potential to process infinite input streams. Every automaton in the sequence
can be seen as a temporary instantiation of the system, before it changes into a 
different automaton. We study the properties of lineages through the translations 
they realize. Lineages of interactive finite automata (or: transducers) have been 
introduced in [6] . 

The concept of transducers acting on infinite input streams ($\omega$-transducers)
is not new. For example, [9] gives an overview of the theory of finite devices
that operate on infinite objects. In the field of non-uniform complexity theory,
sequences of computing devices are commonplace ([1]). It is the idea of
combining these concepts and allowing some form of communication between the
devices in the sequence that is new and that allows for a closer modelling of sys- 
tem evolution. The approach leads to several new fundamental questions that 
are settled in this paper. 

We can measure the “speed of growth” (i.e. “growth complexity”) of a lin- 
eage by a function that relates the index of each automaton to its size. That 
is, the complexity of a lineage is a function g such that the n-th automaton in 
the sequence has g(n) states. Using this measure, we can divide the translations
computed by evolving systems into classes based on the complexity of the lin- 
eages that realize them. Our main result states that this division is non-trivial 
and leads to strict hierarchies, i.e. for every positive, non-decreasing function g, 
there is a translation that can be realized by a lineage of complexity g, but not 
by any lineage of lower complexity. 

The structure of the paper is as follows. In section 2, we define lineages. 

Next, in section 3, we define a novel measure of complexity on translations 
and establish the hierarchy result. 

1.1 Notation 

In most literature, the term transducer is used to denote an automaton with 
output capability. Every time we use the word automaton, the term transducer 
could be substituted for it without ill effects. 

We use $\Sigma$ and $\Omega$ to denote alphabets. We use the notations $\Sigma^*$, $\Sigma^\omega$ for the
sets of finite and infinite strings over $\Sigma$ respectively and $\Sigma^\infty$ for the union of
$\Sigma^*$ and $\Sigma^\omega$. We call a partial function from $\Sigma^\infty$ to $\Omega^\infty$ a translation.

For a string $x \in \Sigma^\infty$ of length at least n, we denote the n-th symbol of x by
$x_n$, or $(x)_n$, to improve readability. We write $x_{[i:j]}$ for $x_i x_{i+1} \ldots x_j$. We also use
the projection functions for tuples, $\pi_n$, which are defined straightforwardly, i.e.,
$\pi_n$ ‘returns’ the n-th component of a tuple that serves as the argument of $\pi_n$,
for any n.

Let D be a subset of $\Sigma^\infty$. We define the n-th prefix domain $P^n(D)$ as the
set of all prefixes of length n of strings in D,

$P^n(D) = \{\, x_{[1:n]} \mid x \in D \,\}$. (1)

We define a topology on $\Sigma^\omega$ as follows. Let u be a finite string over $\Sigma$. Then
the set of all possible extensions of u,

$B(u) := \{\, x \in \Sigma^\omega \mid u \text{ is a prefix of } x \,\}$ (2)

is a basis set. Let S be a subset of $\Sigma^\omega$. We call S an open set if it is a union of
basis sets. A set is closed if it is the complement of an open set.

2 Modelling Evolving Interactive Systems by Lineages 

As we explained in the introduction, the models of classical computability theory 
are not sufficient to capture all aspects of modern computing systems. This calls 
for theories that better describe these systems. Various extensions of classical 
models of computation have been studied that capture non-classical features of 
modern systems in some way (cf. [4, 12]). In this paper, we develop a new model 
for computing systems, initially outlined in [6]: lineages, inspired by a similar 
notion in evolutionary biology. 

The building blocks of the model are a generalisation of Mealy automata. 
These automata process potentially infinite input streams and produce poten- 
tially infinite output streams, one symbol at a time. We do not assume that the 
input is provided on an input tape. Instead, the automaton reads its input from 
a single input port. One symbol is read from this port at each step. Similarly, 
the output goes to a single output port, one symbol at a time. In contrast to 
classical models, the input stream does not have to be known in advance, and 
can be adjusted at any time by an external agent, based on previous in- and 
output symbols. This allows the environment to interact with the automaton. 

We model the evolutionary aspects by considering sequences of automata. 
Each automaton in the sequence represents the next evolutionary phase of the 
system. The way in which this sequence develops need not be described recur- 
sively in general. When a transition occurs from one automaton to its successor, 
the information that the automaton has accumulated over time must be pre- 
served in some way. This is done by requiring that every automaton has a subset 
of its states in common with its immediate successor. 

Definition 1. An automaton is a 6-tuple $A = (\Sigma, \Omega, Q, I, O, \delta)$, where $\Sigma$ and
$\Omega$ are non-empty finite alphabets, Q is a set of states, I and O are subsets of
Q, and $\delta : Q \times \Sigma \to Q \times \Omega$ is a (partial) transition function. $\Sigma$ is the input
alphabet, and $\Omega$ is the output alphabet. We call I the set of entry states and O
the set of exit states.

Definition 2. Let $\mathcal{A}$ be a sequence of automata $A_1, A_2, \ldots$, with the automata
$A_i = (\Sigma, \Omega, Q_i, I_i, O_i, \delta_i)$ such that $O_i \subseteq I_{i+1}$ for every i. We call $\mathcal{A}$ a lineage
of automata, or a lineage for short.

We do not require a recursive recipe for constructing the sequences of automata.
The elements in $Q_i - O_i$ are called local states (of $A_i$). The first automaton, $A_1$,
has an initial state $q_{in} \in I_1$. Usually, $I_1$ contains only the initial state of $A_1$, and
$I_{i+1}$ equals $O_i$. See Fig. 1 for an example.

A lineage $\mathcal{A}$ operates on elements of $\Sigma^\infty$. On an input string x, at any time,
only one automaton processes x. The automaton that processes x at a particular
time is called active (at that time). Initially, $A_1$ is the active automaton, and it
starts processing x. Whenever an active automaton $A_i$ enters an exit state q, it
turns the control over to $A_{i+1}$, which then becomes the active automaton. This
is done by letting $A_{i+1}$ start processing the remainder of x, beginning in state q
(which is an entry state of $A_{i+1}$ by definition). This is called updating, and $A_i$
is the i-th update of $\mathcal{A}$.

Fig. 1. Part of a lineage $\mathcal{A}$. The set of exit states of $A_1$ is a subset of the set of entry
states of $A_2$

Formally, let Q be the union of all $Q_i$ and let $x \in \Sigma^\infty$ be an input to a
lineage $\mathcal{A}$. Using simultaneous recursion, we define a sequence of states $(q_j)_{j \geq 1}$
in Q and a sequence of integers $(m_j)_{j \geq 1}$, with $m_j$ representing the index of the
active automaton at time j, as follows:

$$q_1 = q_{in}, \qquad m_1 = 1,$$
$$q_{j+1} = \pi_1\left(\delta_{m_j}(q_j, x_j)\right),$$
$$m_{j+1} = \begin{cases} m_j + 1 & \text{if } q_{j+1} \in O_{m_j}, \\ m_j & \text{if } q_{j+1} \in Q_{m_j} - O_{m_j}, \\ \text{undefined} & \text{if } q_{j+1} \text{ is undefined.} \end{cases} \qquad (3)$$

Note that $q_{j+1}$ and $m_{j+1}$ depend on $x_{[1:j]}$. Therefore, we also write $q_{j+1}(x_{[1:j]})$
and $m_{j+1}(x_{[1:j]})$ to emphasize the dependence. If $q_j$ is defined for every $j <
|x| + 1$, then we say that x is a valid input to $\mathcal{A}$. In this case, the output of $\mathcal{A}$
on x is the string $y \in \Omega^\infty$ such that $y_j = \pi_2(\delta_{m_j}(q_j, x_j))$, for every $j \geq 1$.
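
The recursion can be traced with a short Python sketch that runs a finite initial part of a lineage on a finite input. The encoding is an assumption of this sketch and not part of the paper: each automaton is a dictionary with a partial transition table under the key "delta" and its set of exit states under the key "exit", and indices are 0-based.

```python
def run_lineage(automata, q_in, x):
    """Output of the lineage on the finite string x, or None if x is not valid.

    automata[i]["delta"] maps (state, input symbol) to (next state, output symbol);
    automata[i]["exit"] is the set of exit states of that automaton.
    """
    q, m = q_in, 0                       # current state and active automaton
    output = []
    for symbol in x:
        if m >= len(automata):
            raise ValueError("the supplied part of the lineage is too short")
        step = automata[m]["delta"].get((q, symbol))
        if step is None:
            return None                  # no transition: x is not a valid input
        q, out = step
        output.append(out)
        if q in automata[m]["exit"]:
            m += 1                       # updating: control passes to the successor
    return "".join(output)

# Hypothetical two-automaton part of a lineage. State "e" is an exit state of
# the first automaton and an entry state of the second.
EXAMPLE_LINEAGE = [
    {"delta": {("s", "0"): ("s", "a"), ("s", "1"): ("e", "b")}, "exit": {"e"}},
    {"delta": {("e", "0"): ("e", "c"), ("e", "1"): ("e", "d")}, "exit": set()},
]
```

On the input 0101 the first automaton handles the first two symbols, enters its exit state on the second one, and the successor handles the rest.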

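Definition 3. Let $\mathcal{A}$ be a lineage. For every $n \geq 1$, we define the partial function
$\Phi^n_{\mathcal{A}} : \Sigma^n \to \Omega^n$ by letting $\Phi^n_{\mathcal{A}}(x)$ be the output of $\mathcal{A}$ on x if x is a valid input
and undefined otherwise, for every x of length n. We say that $\Phi^n_{\mathcal{A}}$ is realized
by the lineage $\mathcal{A}$. In general, for a partial function $\psi : \Sigma^n \to \Omega^n$, we say that $\psi$
is realizable, if there is a lineage $\mathcal{A}$ such that $\psi$ equals $\Phi^n_{\mathcal{A}}$.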
Since there are only finitely many strings of length n, for any lineage $\mathcal{A}$ and
integer n, the translation $\Phi^n_{\mathcal{A}}$ can be realized by a single finite-state automaton,
which justifies the definition. There is no need to restrict our attention to finite
strings.

Definition 4. Let $\mathcal{A}$ be a lineage. We define the partial function $\Phi_{\mathcal{A}} : \Sigma^\infty \to
\Omega^\infty$ by letting $\Phi_{\mathcal{A}}(x)$ be the output of $\mathcal{A}$ on x if x is a valid input and undefined
otherwise, for every infinite string x. We say that $\Phi_{\mathcal{A}}$ is non-uniformly realized
by the lineage $\mathcal{A}$. In general, for a partial function $\Psi : \Sigma^\infty \to \Omega^\infty$, we say that
$\Psi$ is non-uniformly realizable, if there is a lineage $\mathcal{A}$ such that $\Psi$ equals $\Phi_{\mathcal{A}}$.

For many lineages $\mathcal{A}$, the translation $\Phi_{\mathcal{A}}$ is not realizable by a single
finite-state automaton and not even by a Turing transducer, see [10] for details. For
the remainder of this paper, we consider translations on infinite strings, unless
stated otherwise. See also [6].

Let $\Phi$ be a translation and n an integer. We say that $(\Phi)_n$ depends only on the
first n input symbols, if there is a function $f : \Sigma^n \to \Omega$, such that $(\Phi(x))_n$ equals
$f(x_{[1:n]})$ for every x in the domain of $\Phi$. A non-uniformly realizable translation
has this property for every n, since

$(\Phi(x))_n = \pi_2(\delta_{m_n}(q_n, x_n)) = \pi_2\left(\delta_{m_n(x_{[1:n-1]})}(q_n(x_{[1:n-1]}), x_n)\right)$. (4)

Let $\mathcal{A} = A_1, A_2, A_3, \ldots$ be a lineage of automata and let n and m be integers.
In a slight abuse of notation, we say that $A_m$ is able to process all strings of
length n, if $m_n(x_{[1:n]}) \leq m$ for every string x. In other words, if for any string
x, the lineage $\mathcal{A}$ needs less than m updates to process the first n symbols of x.

2.1 Alternative Characterisations of Lineages 

In [10], we show that lineages of automata are equivalent to interactive Turing 
Machines with advice. Among other things, this shows that the theory of lineages 
has deep connections with non-uniformity theories. 

The following two Propositions provide a useful characterisation of non- 
uniformly realizable translations. 

Proposition 1. Let $\Phi$ be a non-uniformly realizable translation. Then the
domain of $\Phi$ is closed and $(\Phi)_n$ depends only on the first $n$ input
symbols, for every $n$.

Proof. Let $D$ be the domain of $\Phi$ and let $A$ be a lineage that
non-uniformly realizes $\Phi$. Let $x \notin D$ be an infinite string and
consider a run of $A$ on $x$. Because $x$ is not in the domain of $\Phi$, it
must be the case that at some point, after processing a prefix of length
$n - 1$ of $x$, the automaton that is active at that time, say $A_m$, is in a
certain state $q$, such that there is no transition from $q$ with $x_n$ as
input. If this moment did not occur, then $A$ would never halt during the run,
and $x$ would be in the domain.

Let $y$ be a string in $B(x_{[1:n]})$ and consider a run of $A$ on $y$. Since
the first $n$ symbols of $x$ and $y$ are the same, the computation will halt at
the same point, so $y$ cannot be in the domain of $\Phi$. Hence $B(x_{[1:n]})$
does not intersect $D$, which implies that $D$ is closed.

It follows from (4) that $(\Phi(x))_n$ depends only on the first $n$ input
symbols.

□

Footnote: $(\Phi)_n$ is shorthand for the function $x \mapsto (\Phi(x))_n$.



Proposition 2. Let $\Phi$ be a translation. Suppose the domain $D$ of $\Phi$ is
closed and $(\Phi)_n$ depends only on the first $n$ input symbols, for every
$n$. Then $\Phi$ is non-uniformly realizable by a lineage of size $|P^n(D)|$.



Proof. Let $D$ be the domain of $\Phi$. Let $f_n$ be the function with domain
$P^n(D)$ such that $(\Phi(x))_n = f_n(x_{[1:n]})$ for every $x \in D$. By the
assumptions of the Proposition, $f_n$ is well-defined.

We construct a lineage $A$ that non-uniformly realizes $\Phi$. Every state of
$A$ will correspond to a prefix $u$ of a string in $D$. We label the
corresponding state by $[u]$. Define the set of states of $A_n$ for $n \ge 1$ by



In ={M I } , 

= { H I UG P-{D) } . 



(5) 



The trick is to choose the states such that is a subset of 0„ for every n. The 
initial state of $A_1$ is $[\varepsilon]$. The transition function $\delta_n$
is defined by

    $\delta_n([u], a) = \begin{cases} ([ua], f_n(ua)) & \text{if } ua \in P^n(D) \\ \text{undefined} & \text{otherwise} \end{cases}$    (6)



By the definition of $f_n$, we infer that the lineage $A$ produces $\Phi(x)$ on
input $x \in D$.

Suppose on the other hand that $x \notin D$. Since $D$ is closed, there is a
basis set $B(u)$ that contains $x$, which does not intersect $D$. It follows
that $u \notin P^{|u|}(D)$. Since the transition functions are not defined on
$u$, we see that $x$ is not a valid input to $A$. Hence $A$ non-uniformly
realizes $\Phi$. Note that $A$ is of size $|P^n(D)|$.

□ 
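The prefix-labelled lineage used in this proof can be illustrated on a finite
horizon. The Python sketch below is only an illustration under stated
assumptions: the prefixes of $D$ are supplied explicitly as a finite set
`allowed_prefixes`, the hypothetical function `f` stands in for the family
$f_n$, and the names `run_prefix_lineage` and `states_per_automaton` are
introduced here and do not come from the paper. A state is simply the prefix
read so far, and a missing transition makes the input invalid.

```python
def run_prefix_lineage(allowed_prefixes, f, x):
    """Run the prefix-labelled lineage on the finite string x.

    allowed_prefixes: finite set of strings standing in for the prefixes of D
                      (assumption: only a finite horizon is modelled).
    f:                maps a prefix u to the symbol emitted when state [u] is
                      entered; it stands in for the functions f_n.
    Returns the output string, or None when x is not a valid input.
    """
    state = ""                       # the initial state [epsilon]
    output = []
    for symbol in x:
        successor = state + symbol
        if successor not in allowed_prefixes:
            return None              # transition undefined: x is rejected
        output.append(f(successor))
        state = successor            # the next automaton takes over
    return "".join(output)


def states_per_automaton(allowed_prefixes):
    """Count the prefixes of each length n, i.e. the size |P^n(D)|."""
    sizes = {}
    for u in allowed_prefixes:
        sizes[len(u)] = sizes.get(len(u), 0) + 1
    return dict(sorted(sizes.items()))


if __name__ == "__main__":
    prefixes = {"a", "b", "aa", "ab", "ba"}
    echo_upper = lambda u: u[-1].upper()   # hypothetical output function
    print(run_prefix_lineage(prefixes, echo_upper, "ab"))
    print(run_prefix_lineage(prefixes, echo_upper, "bb"))
    print(states_per_automaton(prefixes))
```

In this toy instance the string "bb" is rejected because "bb" is not among the
allowed prefixes, which mirrors the argument that an input outside a closed
domain fails after finitely many symbols.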



3 The Complexity of Lineages 

In this section, we develop the notion of complexity of lineages and establish 
several results which allow us to compare many translations, based on the com- 
plexity of the lineages that non-uniformly realize them. 

The processing power of an automaton is directly related to the number of 
states. An automaton with more states is able to distinguish among a greater 
number of different situations. It can apply different actions to each situation it 
can recognise, thus adding more diversity to a computation. 

For a lineage, which is a sequence of automata, the number of states of each 
of the constituent automata contributes to the computing power of the lineage. 
Therefore, we use a function to describe the complexity of a lineage.

Definition 5. The complexity of a lineage $A$ is a function $g$ such that for
every $n$, the number of states of $A_n$ equals $g(n)$. We say that a
translation $\Phi$ is of complexity $g$ if there is a lineage $A$ of complexity
$g$ that non-uniformly realizes $\Phi$. We define the complexity class
SIZE($g$) as the class of non-uniformly realizable translations of complexity
$g$ or less.

First, we give an upper bound on the complexity of any non-uniformly real- 
izable translation. 

Proposition 3. Let $\Phi$ be a non-uniformly realizable translation over an
alphabet of size $c$. Then $\Phi$ can be non-uniformly realized by a lineage of
size at most $c^n$.

Proof. By Proposition 1 and Proposition 2, we can obtain a lineage $A$ of size
$|P^n(D)|$ that non-uniformly realizes $\Phi$, where $D$ is the domain of
$\Phi$. There are at most $c^n$ strings of length $n$, so $|P^n(D)| \le c^n$.

□ 



3.1 Complexity Classes and the Functions That Represent Them 

We have expressed the complexity of an evolving interactive system by a
positive integer-valued function. Conversely, we ask which positive
integer-valued functions represent a complexity class. Some functions do not
naturally correspond to a complexity class, e.g. the super-exponential functions (Proposition
3). If a function is non-decreasing and has a “growth rate”^ that is bounded by 
a constant, then it corresponds to a complexity class. If it is not, then we take a 
suitable function that is nowhere greater than the original function and consider 
its corresponding class. This is made precise below. 

Let $g : \mathbb{N} \to \mathbb{N}$ be an arbitrary positive non-decreasing
function and $c$ a positive integer. Define the function $g_c(n)$ by

    $g_c(1) = \min\{ g(1), c \}$
    $g_c(n+1) = \min\{ g(n+1), c \cdot g_c(n) \}$

It follows that for every $n$,

- $g_c(n) \le g(n)$,

- $g_c(n) \le c^n$,

- $g_c(n) \le g_c(n+1)$, and

- $g_c(n+1) \le c \cdot g_c(n)$.

In other words, $g_c$ is a positive, non-decreasing function that is bounded by
$g$ and $c^n$, with a “growth rate” that is bounded by $c$.
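The recursive definition of the bounded function can be checked numerically.
The following Python sketch is a minimal illustration, assuming the function is
passed as a callable on positive integers; the names `bounded_growth` and
`check_properties` are introduced here for illustration only.

```python
def bounded_growth(g, c, n_max):
    """Return [g_c(1), ..., g_c(n_max)] for a positive non-decreasing g."""
    values = [min(g(1), c)]
    for n in range(1, n_max):
        # g_c(n+1) = min{ g(n+1), c * g_c(n) }
        values.append(min(g(n + 1), c * values[-1]))
    return values


def check_properties(g, c, n_max):
    """Verify the four listed inequalities for n = 1 .. n_max."""
    gc = bounded_growth(g, c, n_max)
    for n in range(1, n_max + 1):
        assert gc[n - 1] <= g(n)          # bounded by g
        assert gc[n - 1] <= c ** n        # bounded by c^n
    for n in range(1, n_max):
        assert gc[n - 1] <= gc[n]         # non-decreasing
        assert gc[n] <= c * gc[n - 1]     # growth rate at most c
    return True


if __name__ == "__main__":
    cube = lambda n: n ** 3
    print(bounded_growth(cube, 2, 5))
    print(check_properties(cube, 2, 20))
```

With the cubic function and base 2, the bounded version starts out following
the powers of two, because the cubes grow faster than a factor of 2 per step at
small arguments.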

To show that the class SIZE($g_c$) is not empty, we construct a translation
$\Phi^{g,c}$ that is in this class. We also show that $\Phi^{g,c}$ is not in
SIZE($h$), for any function $h$ such that $h(n) < g_c(n)$ for some $n$. The
construction is based on the following idea: suppose $A$ is a lineage that
non-uniformly realizes a translation $\Phi$. Two input-prefixes $u$ and $v$
(not necessarily of the same size) are considered inequivalent when there is an
infinite string $x$ and an integer $i$ such that $\Phi(ux)$ and $\Phi(vx)$ both
exist and $(\Phi(ux))_{|u|+i} \ne (\Phi(vx))_{|v|+i}$. This is impossible if
$A$ enters the same state after processing either $u$ or $v$. Thus, to make
sure that an automaton of a lineage has at least $k$ states, we need to make
sure that there are at least $k$ inequivalent inputs to choose from, once this
automaton is active.

^ The growth rate of a function $g$ is defined as $g(n+1)/g(n)$.
First, we establish the domain of $\Phi^{g,c}$. Let $\Sigma$ be an alphabet of
size $c$. For every $n$, we choose $g_c(n)$ strings of length $n$, such that
they are prefixes of the $g_c(n+1)$ strings of length $n+1$. The details are
given in Construction 1.

Construction 1. Label the letters from $\Sigma$ as $a_1$ through $a_c$, and let
$C_n$ be the chosen subset of size $g_c(n)$ of $\Sigma^n$. We proceed
recursively.

    $C_1 = \{ a_i \mid i \le g_c(1) \}$ .    (8)

Assume $C_n$ is chosen. Using integer division, we write
$g_c(n+1) = l \cdot g_c(n) + m$, for unique integers $l$ and $m$, with
$0 \le m < g_c(n)$. It follows that $1 \le l \le c$. Let $u_1, \ldots, u_m$ be
$m$ different strings in $C_n$. Now take

    $C_{n+1} = \{ u a_i \mid u \in C_n \wedge i \le l \} \cup \{ u_j a_{l+1} \mid j \le m \}$ .    (9)

It is left to the reader to verify that $C_{n+1}$ contains $g_c(n+1)$ strings
of length $n+1$. Note that $C_{n+1}$ is well-defined, as either $l < c$ or
$l = c$ and $m = 0$.
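Construction 1 is easy to mechanise. The Python sketch below is an illustration
under assumptions: the values of the bounded function are supplied as a list
`gc_values` (entry `n-1` holding the value at `n`), the alphabet is a string
whose characters play the role of the letters $a_1, \ldots, a_c$, and the first
`m` strings of the current level are taken as $u_1, \ldots, u_m$. The helper
name `build_choices` is hypothetical.

```python
def build_choices(gc_values, alphabet):
    """Return the levels [C_1, C_2, ...] as lists of strings.

    gc_values: list with gc_values[n-1] = g_c(n); assumed positive,
               non-decreasing, with growth rate at most len(alphabet).
    alphabet:  string whose i-th character stands for the letter a_i.
    """
    level = [alphabet[i] for i in range(gc_values[0])]       # equation (8)
    levels = [level]
    for current, following in zip(gc_values, gc_values[1:]):
        l, m = divmod(following, current)   # g_c(n+1) = l * g_c(n) + m
        # Every choice is extended with a_1 .. a_l ...
        extended = [u + alphabet[i] for u in level for i in range(l)]
        # ... and m of them are additionally extended with a_{l+1}.
        extended += [u + alphabet[l] for u in level[:m]]
        level = extended                                      # equation (9)
        levels.append(level)
    return levels


if __name__ == "__main__":
    for n, level in enumerate(build_choices([2, 3, 4], "abc"), start=1):
        print(n, level)
```

When `m` is zero the second comprehension iterates over an empty list, so the
letter with index `l + 1` is never looked up; this matches the remark that the
construction is well-defined when $l = c$ and $m = 0$.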

We call a string in $C_n$ a choice. Note that $ua_1$ is a choice if $u$ is a
choice. Consider an infinite sequence of choices, such that each choice is a
prefix of its successor. Such a sequence defines a unique infinite string, such
that each of its prefixes is a choice. Let $\Delta$ be the set of infinite
strings $x$ such that each prefix of $x$ is a choice. The translation
$\Phi^{g,c}$ will be defined on the domain $\Delta$.



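Construction 2. Consider the family of functions
$f_{k,l,m} : \Sigma^\infty \to \Sigma^{l+1}$ defined by

    $f_{k,l,m}(x) = \begin{cases} x_k^{l+1} & \text{if } 0 < k \le m \\ x_1^{l+1} & \text{otherwise} \end{cases}$    (10)

for $k, l, m \in \mathbb{N}$. Let $\psi : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$
be a surjective function that attains each value infinitely often. Then the
translation $\Phi$ is defined by

    $\Phi(x) = f_{\psi(1),1}(x) \, f_{\psi(2),2}(x) \, f_{\psi(3),3}(x) \cdots$    (11)

Finally, we define the translation $\Phi^{g,c}$ by

    $\Phi^{g,c}(x) = \begin{cases} \Phi(x) & \text{if } x \in \Delta \\ \text{undefined} & \text{otherwise} \end{cases}$    (12)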
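A finite prefix of the translation built from blocks of repeated input symbols
can be computed directly. The Python sketch below rests on two assumptions that
are not fixed by the text: the surjection is realised by applying the inverse
Cantor pairing twice (the helper `psi` is a hypothetical choice, any surjection
hitting each pair infinitely often would do), and the fallback branch of the
block function repeats the first input symbol. Strings are indexed from 1 in
the text and from 0 in the code.

```python
from math import isqrt


def unpair(z):
    """Inverse of the Cantor pairing function on non-negative integers."""
    w = (isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return w - y, y


def psi(m):
    """Hypothetical surjection N -> N x N hitting each pair infinitely often.

    m is first split into (p, r); p alone determines the pair (k, l), so
    every pair is reached once for each value of r.
    """
    p, _ = unpair(m)
    return unpair(p)


def block(k, l, m, x):
    """f_{k,l,m}(x): l + 1 copies of x_k, or of x_1 in the assumed fallback."""
    index = k if 0 < k <= m else 1
    return x[index - 1] * (l + 1)


def phi_prefix(x, blocks):
    """Concatenate the first `blocks` blocks of the output on input x."""
    if len(x) < blocks:
        raise ValueError("need at least as many input symbols as blocks")
    return "".join(block(*psi(m), m, x) for m in range(1, blocks + 1))


if __name__ == "__main__":
    print(phi_prefix("abcdef", 6))
    print(phi_prefix("bacdef", 6))
```

Because block number `m` only reads an input position not exceeding `m`, and
block `m` starts no earlier than output position `m`, the first `n` output
symbols never depend on input symbols beyond position `n`.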



The translation $\Phi^{g,c}$ can be non-uniformly realized by a lineage of
automata. To prove this, we need the following two Lemmas.

Lemma 1. $\Delta$ is a closed set.

Proof. Suppose $x \notin \Delta$. Then there is a prefix $u$ of $x$ such that
$u$ is not a choice. Let $y$ be an infinite string in $B(u)$. Since $u$ is a
prefix of $y$, it follows that $y \notin \Delta$. We conclude that $\Delta$ is
closed.

□ 

Lemma 2. For every $n$, the function $\pi_n \circ \Phi^{g,c}$ depends only on
the first $n$ input symbols.

Proof. Let $x \in \Delta$ be an infinite string. The output $\Phi(x)$ is the
concatenation of infinitely many strings of the form $f_{\psi(m),m}(x)$. For
every integer $m \ge 1$, the string $f_{\psi(m),m}(x)$ starts at index

    $i_m = 1 + \sum_{j=1}^{m-1} (\pi_2(\psi(j)) + 1) \ge m$ .    (13)

This string consists of multiple copies of $x_k$, for a certain $k \le m$. Note
that $k$ does not depend on the particular choice of $x$. The $n$-th symbol of
$\Phi(x)$ belongs to $f_{\psi(m),m}(x)$ for a certain $m$. Obviously
$n \ge i_m$, so $k \le m \le i_m \le n$. It follows that
$\pi_n(\Phi^{g,c}(x)) = x_k$ for a certain $k \le n$.

□ 

Combining Lemmas 1, 2 and Proposition 2, we conclude that $\Phi^{g,c}$ can be
non-uniformly realized. Next, we will examine the complexity of $\Phi^{g,c}$.
Proposition 4 shows that $\Phi^{g,c}$ is of complexity $g_c$, while
Proposition 5 tells us that any lineage with a complexity less than $g_c$
cannot non-uniformly realize $\Phi^{g,c}$.

Proposition 4. The translation $\Phi^{g,c}$ can be non-uniformly realized by a
lineage $A$ that updates at every step, such that $A_n$ has $g_c(n)$ states.

Proof. Let $A$ be the lineage from the proof of Proposition 2. We see that
$P^n(\Delta)$ equals $C_n$. It follows that $A$ is of size $g_c$.

□ 

For the proof of Proposition 5, we need the following Lemma. 

Lemma 3. Let $k$, $l$ and $n \ge 1$ be integers such that $k, l \ge n$. Let $x$
and $y$ be infinite strings such that $x_n \ne y_n$. Then there is an $i$ such
that

    $\pi_{k+i}(\Phi(x)) \ne \pi_{l+i}(\Phi(y))$ .    (14)

Proof. Assume $k \ge l$. Let $t = k - l$. Choose an integer $m \ge l$ such that
$\psi(m) = (n, t)$. Then $f_{\psi(m),m}(x) = f_{n,t,m}(x) = x_n^{t+1}$, since
$n \le m$. It follows that $\Phi(x)$ contains a string $x_n^{t+1}$, starting at
an index $i_m \ge m$, namely the string $f_{\psi(m),m}(x)$. Similarly,
$\Phi(y)$ contains a string $y_n^{t+1}$, starting at the same index, see
Fig. 2. But then

    $\pi_{i_m+t}(\Phi(x)) \ne \pi_{i_m}(\Phi(y))$ ,    (15)

since $x_n \ne y_n$. Now $i_m \ge m \ge l$, so $i_m = l + i$ for a certain $i$,
and $i_m + t = l + i + (k - l) = k + i$. Therefore

    $\pi_{k+i}(\Phi(x)) \ne \pi_{l+i}(\Phi(y))$ .    (16)

□

Fig. 2. Part of the outputs of $\Phi$ on the inputs $x$ and $y$. Starting in
position $i_m$, the outputs contain a sequence of $x_n$'s and $y_n$'s
respectively, of size $t + 1$ each. As a result,
$\pi_{i_m+t}(\Phi(x)) = x_n \ne y_n = \pi_{i_m}(\Phi(y))$



Let $u$ and $v$ be two different choices of length $n$. Let $u'$ be a choice
that extends $u$ and $v'$ a choice that extends $v$. Finally, let
$x = (a_1)^\omega$. It follows that $u'x$ and $v'x$ are elements of $\Delta$.
Since $u \ne v$, it follows that there is an $n' \le n$, such that
$\pi_{n'}(u'x) \ne \pi_{n'}(v'x)$. By Lemma 3, there is an $i$ such that

    $\pi_{|u'|+i}(\Phi^{g,c}(u'x)) \ne \pi_{|v'|+i}(\Phi^{g,c}(v'x))$ .    (17)

See Fig. 3 for a visual explanation.

Fig. 3. Two finite choices $u$ and $v$, such that $u \ne v$, are extended to
choices $u'$ and $v'$ respectively. These choices are extended with the
infinite string $x = (a_1)^\omega$. Then there is an integer $i$ such that
$\pi_{|u'|+i}(\Phi^{g,c}(u'x)) \ne \pi_{|v'|+i}(\Phi^{g,c}(v'x))$



Proposition 5. Let $A$ be a lineage that non-uniformly realizes $\Phi^{g,c}$.
Suppose $A_m$ is able to process all strings of length $n$. Then $A_m$ has at
least $g_c(n)$ states.

Proof. Suppose $A_m$ has fewer than $g_c(n)$ states. Then there are two
different choices $u$ and $v$ of length $n$, with choices $u'$ and $v'$ that
extend $u$ and $v$ respectively, such that $A_m$ enters the same state $r$
after processing either $u'$ or $v'$, see Fig. 4.

Then there is an $i$ that satisfies (17). Suppose $A$ is in the $m$-th
update⁵, in state $r$. Now we give $x = (a_1)^\omega$ as further input to $A$.
After $i$ steps, $A$ enters a state $r'$ with a certain output $b$. These last
$i$ steps are independent of the steps that $A$ took to reach $r$. In other
words,

    $\pi_{|u'|+i}(\Phi^{g,c}(u'x)) = \pi_{|v'|+i}(\Phi^{g,c}(v'x)) = b$ ,    (18)

which contradicts (17). It follows that $A_m$ must have at least $g_c(n)$
states.

□

Fig. 4. The paths of two valid input prefixes $u'$ and $v'$. After processing
$u'$ or $v'$, $A_m$ enters the state $r$. Then the remainder of the input is
processed, which equals $(a_1)^\omega$ in both cases. The rest of the path only
depends on $r$ and $(a_1)^\omega$, so after $i$ steps, both paths enter $r'$
and the same output symbol is generated



Corollary 1. Let $A$ be a lineage that non-uniformly realizes $\Phi^{g,c}$.
Then $A_n$ has at least $g_c(n)$ states.

Proof. Since each active automaton must read at least one symbol before $A$ can
update, by the time $A$ is ready to update to the $(n+1)$-st automaton, at
least $n$ symbols have been read. Hence $A_n$ is able to process all strings of
length $n$, and the claim follows from Proposition 5.

□ 

For any non-decreasing function $g$ and any integer $c \ge 1$, the complexity
class SIZE($g_c$) contains the translation $\Phi^{g,c}$. Furthermore,
SIZE($g_c$) is the smallest complexity class that contains $\Phi^{g,c}$.

⁵ Or the $(m+1)$-st, if $r$ is an exit state.

3.2 A Hierarchy Result for Complexity Classes

For clarity, we repeat the results of the preceding section in one Proposition. 

Proposition 6. Let $c$ be a positive integer and $g$ a positive, non-decreasing
function such that $g$ equals $g_c$. Let $h$ be a function such that
$h(m) < g(m)$ for a certain $m$. Then SIZE($g$) $-$ SIZE($h$) is non-empty.

Proof. Combine Proposition 4 and Corollary 1.

□ 

When we are free to choose $c$, we can show that for any positive
non-decreasing function $g$, translations exist that cannot be non-uniformly
realized by any lineage that has fewer than $g(n)$ states in its $n$-th
automaton, for any $n$. Observe that this is a stronger claim than before; we
no longer require the “growth rate” to be bounded.

Theorem 1. Let $g$ be a positive, non-decreasing function and let $h$ be a
function such that $h(m) < g(m)$ for a certain $m$. Then
SIZE($g$) $-$ SIZE($h$) is non-empty.

Proof. Let $c \ge g(1)$ be an integer such that $g(n+1) \le c \cdot g(n)$ for
every $n$ smaller than $m$. Then $g_c(n) = g(n)$ for all $n \le m$. It follows
that $h(m) < g_c(m)$. The translation $\Phi^{g,c}$ can be non-uniformly
realized by a lineage of size $g_c$ (Proposition 4). Hence
$\Phi^{g,c} \in$ SIZE($g$).

Any lineage $A$ that non-uniformly realizes $\Phi^{g,c}$ must have at least
$g_c(n)$ states in its $n$-th automaton (Corollary 1). Since $h(m) < g_c(m)$,
it follows that $A$ is not of size $h$. Hence $\Phi^{g,c} \notin$ SIZE($h$).

□ 
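The choice of the constant in this proof can be illustrated numerically. The
Python sketch below is an illustration only: it picks the smallest integer that
is at least the first function value and at least every ratio of consecutive
values below the chosen position, and then confirms that the bounded function
agrees with the original one up to that position. The names `suitable_base` and
`bounded_growth` are introduced here and are not taken from the paper.

```python
from math import factorial


def bounded_growth(g, c, n_max):
    """Return [g_c(1), ..., g_c(n_max)] following the recursive definition."""
    values = [min(g(1), c)]
    for n in range(1, n_max):
        values.append(min(g(n + 1), c * values[-1]))
    return values


def suitable_base(g, m):
    """Smallest integer c with c >= g(1) and g(n+1) <= c * g(n) for all n < m."""
    c = g(1)
    for n in range(1, m):
        ratio_ceiling = -(-g(n + 1) // g(n))   # ceiling of g(n+1) / g(n)
        c = max(c, ratio_ceiling)
    return c


if __name__ == "__main__":
    m = 5
    c = suitable_base(factorial, m)
    print(c)
    print(bounded_growth(factorial, c, m))
    print([factorial(n) for n in range(1, m + 1)])
```

The factorial function has an unbounded growth rate, yet for any fixed position
a finite base suffices; beyond that position the bounded function may fall
below the original one, which does not affect the argument.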

Corollary 2. Let $g$ and $h$ be positive non-decreasing functions such that
$h(n) \le g(n)$ for all $n$. If the inequality is strict for a certain $n$,
then SIZE($h$) is a proper subset of SIZE($g$).

Proof. By definition, any translation of complexity $h$ is in SIZE($g$). By
Theorem 1, not every translation of complexity $g$ is in SIZE($h$).

□ 

This means that every extra state of a lineage can be used to gain more potential 
computing power. 

Corollary 3. Let $g$ and $h$ be positive non-decreasing functions such that
$h(n) < g(n)$ for a certain $n$ and $g(m) < h(m)$ for a certain $m$. Then the
classes SIZE($g$) and SIZE($h$) are incomparable: both contain translations
that do not occur in the other.

4 Conclusion

In this paper, we have developed the theory of lineages as a new model of compu- 
tation. A lineage is a sequence of finite automata, where each automaton in the 
sequence is viewed as the next instantiation or incarnation of the evolving system 
that it models. By using lineages, one can immediately single out the evolution- 
ary aspects of the system. The development of a lineage is modelled by looking 
at the automata in the sequence, and the relation between each automaton and 
its immediate successor. 

An important characteristic of an automaton is its size. For a lineage, we de- 
fined a complexity measure based on the size of the automata in the sequence. We 
proved in Theorem 1 that lineages of higher complexity are able to non-uniformly 
realize more translations than lineages of lower complexity. Specifically, for
each non-decreasing function $g$, there is a translation that can be
non-uniformly realized by a lineage of complexity $g$, but not by any lineage
that has fewer than $g(n)$ states available for its $n$-th automaton, for a
certain $n$. On the other hand, once a translation (over an input alphabet of
size $c$) is fixed, we know that it can be non-uniformly realized by a lineage
of complexity at most $c^n$.

We conclude that lineages of automata present an attractive model for evolv- 
ing interactive systems, with a basic mechanism for the underlying mode of com- 
putation. The attractiveness is due to the mathematical elegance of the model 
which, in spite of its apparent simplicity, still captures the important aspects of 
many other models. The close relationship of lineages to finite automata makes 
the model even more interesting since techniques and proofs from automata 
theory can be adapted to the theory of lineages. 
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